linux-btrfs.vger.kernel.org archive mirror
* [PATCH] Btrfs: hold enough space for global_rsv
@ 2012-01-17  9:51 Liu Bo
  2012-01-27 15:25 ` Chris Mason
  2012-02-27 13:29 ` Johannes Hirte
  0 siblings, 2 replies; 8+ messages in thread
From: Liu Bo @ 2012-01-17  9:51 UTC (permalink / raw)
  To: linux-btrfs

I've kept hitting enospc warnings of global_rsv while running defragment on
files:
btrfs: block rsv returned -28
WARNING: at fs/btrfs/extent-tree.c:5984 btrfs_alloc_free_block+0x333/0x340 [btrfs]()
...

I used a fio job to create a file with lots of fragments:
$ filefrag /mnt/btrfs/foobar
/mnt/btrfs/foobar: 66964 extents found

and then "btrfs fi defrag /mnt/btrfs/foobar && sync" would pop the warnings.

I found that the global_rsv size is just not enough for defragment, and didn't
find any space leak in using global_rsv, so double it and go ahead.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 8603ee4..77ea23c 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3979,7 +3979,7 @@ static u64 calc_global_metadata_size(struct btrfs_fs_info *fs_info)
 	num_bytes += div64_u64(data_used + meta_used, 50);
 
 	if (num_bytes * 3 > meta_used)
-		num_bytes = div64_u64(meta_used, 3);
+		num_bytes = div64_u64(meta_used, 3) * 2;
 
 	return ALIGN(num_bytes, fs_info->extent_root->leafsize << 10);
 }
-- 
1.6.5.2



* Re: [PATCH] Btrfs: hold enough space for global_rsv
  2012-01-17  9:51 [PATCH] Btrfs: hold enough space for global_rsv Liu Bo
@ 2012-01-27 15:25 ` Chris Mason
  2012-02-27 13:29 ` Johannes Hirte
  1 sibling, 0 replies; 8+ messages in thread
From: Chris Mason @ 2012-01-27 15:25 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Tue, Jan 17, 2012 at 05:51:59PM +0800, Liu Bo wrote:
> I've kept hitting enospc warnings of global_rsv while running defragment on
> files:
> btrfs: block rsv returned -28
> WARNING: at fs/btrfs/extent-tree.c:5984 btrfs_alloc_free_block+0x333/0x340 [btrfs]()
> ...
> 
> I used a fio job to create a file with lots of fragments:
> $ filefrag /mnt/btrfs/foobar
> /mnt/btrfs/foobar: 66964 extents found
> 
> and then "btrfs fi defrag /mnt/btrfs/foobar && sync" would pop the warnings.
> 
> I found that the global_rsv size is just not enough for defragment, and didn't
> find any space leak in using global_rsv, so double it and go ahead.

I haven't pulled this one in yet, mostly because I think we need to take
a step back and look harder at the numbers.

-chris


* Re: [PATCH] Btrfs: hold enough space for global_rsv
  2012-01-17  9:51 [PATCH] Btrfs: hold enough space for global_rsv Liu Bo
  2012-01-27 15:25 ` Chris Mason
@ 2012-02-27 13:29 ` Johannes Hirte
  2012-02-28  2:06   ` Liu Bo
  1 sibling, 1 reply; 8+ messages in thread
From: Johannes Hirte @ 2012-02-27 13:29 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Tue, 17 Jan 2012 17:51:59 +0800
Liu Bo <liubo2009@cn.fujitsu.com> wrote:

> I've kept hitting enospc warnings of global_rsv while running
> defragment on files:
> btrfs: block rsv returned -28
> WARNING: at fs/btrfs/extent-tree.c:5984
> btrfs_alloc_free_block+0x333/0x340 [btrfs]() ...
> 
> I used a fio job to create a file with lots of fragments:
> $ filefrag /mnt/btrfs/foobar
> /mnt/btrfs/foobar: 66964 extents found
> 
> and then "btrfs fi defrag /mnt/btrfs/foobar && sync" would pop the
> warnings.
> 
> I found that the global_rsv size is just not enough for defragment,
> and didn't find any space leak in using global_rsv, so double it and
> go ahead.
> 
> Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
> ---
>  fs/btrfs/extent-tree.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 8603ee4..77ea23c 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3979,7 +3979,7 @@ static u64 calc_global_metadata_size(struct btrfs_fs_info *fs_info)
>  	num_bytes += div64_u64(data_used + meta_used, 50);
>  
>  	if (num_bytes * 3 > meta_used)
> -		num_bytes = div64_u64(meta_used, 3);
> +		num_bytes = div64_u64(meta_used, 3) * 2;
>  
>  	return ALIGN(num_bytes, fs_info->extent_root->leafsize << 10);
>  }

This patch breaks my system. With this applied all services fail on
boot with "no space left" messages.


* Re: [PATCH] Btrfs: hold enough space for global_rsv
  2012-02-27 13:29 ` Johannes Hirte
@ 2012-02-28  2:06   ` Liu Bo
  2012-03-06 13:50     ` Johannes Hirte
  0 siblings, 1 reply; 8+ messages in thread
From: Liu Bo @ 2012-02-28  2:06 UTC (permalink / raw)
  To: Johannes Hirte; +Cc: linux-btrfs

On 02/27/2012 09:29 PM, Johannes Hirte wrote:
> On Tue, 17 Jan 2012 17:51:59 +0800
> Liu Bo <liubo2009@cn.fujitsu.com> wrote:
> 
>> I've kept hitting enospc warnings of global_rsv while running
>> defragment on files:
>> btrfs: block rsv returned -28
>> WARNING: at fs/btrfs/extent-tree.c:5984
>> btrfs_alloc_free_block+0x333/0x340 [btrfs]() ...
>>
>> I used a fio job to create a file with lots of fragments:
>> $ filefrag /mnt/btrfs/foobar
>> /mnt/btrfs/foobar: 66964 extents found
>>
>> and then "btrfs fi defrag /mnt/btrfs/foobar && sync" would pop the
>> warnings.
>>
>> I found that the global_rsv size is just not enough for defragment,
>> and didn't find any space leak in using global_rsv, so double it and
>> go ahead.
>>
>> Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
>> ---
>>  fs/btrfs/extent-tree.c |    2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index 8603ee4..77ea23c 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -3979,7 +3979,7 @@ static u64 calc_global_metadata_size(struct btrfs_fs_info *fs_info)
>>  	num_bytes += div64_u64(data_used + meta_used, 50);
>>  
>>  	if (num_bytes * 3 > meta_used)
>> -		num_bytes = div64_u64(meta_used, 3);
>> +		num_bytes = div64_u64(meta_used, 3) * 2;
>>  
>>  	return ALIGN(num_bytes, fs_info->extent_root->leafsize << 10);
>>  }
> 
> This patch breaks my system. With this applied all services fail on
> boot with "no space left" messages.
> 

It's weird since this patch is just aiming to enlarge our metadata reservation count.

so you've tried a revert or a bisect, right?  Can you show me the environment or any log messages?

thanks,
liubo


* Re: [PATCH] Btrfs: hold enough space for global_rsv
  2012-02-28  2:06   ` Liu Bo
@ 2012-03-06 13:50     ` Johannes Hirte
  2012-03-08 19:22       ` Johannes Hirte
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Hirte @ 2012-03-06 13:50 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Tue, 28 Feb 2012 10:06:14 +0800
Liu Bo <liubo2009@cn.fujitsu.com> wrote:

> On 02/27/2012 09:29 PM, Johannes Hirte wrote:
> > On Tue, 17 Jan 2012 17:51:59 +0800
> > Liu Bo <liubo2009@cn.fujitsu.com> wrote:
> > 
> >> I've kept hitting enospc warnings of global_rsv while running
> >> defragment on files:
> >> btrfs: block rsv returned -28
> >> WARNING: at fs/btrfs/extent-tree.c:5984
> >> btrfs_alloc_free_block+0x333/0x340 [btrfs]() ...
> >>
> >> I used a fio job to create a file with lots of fragments:
> >> $ filefrag /mnt/btrfs/foobar
> >> /mnt/btrfs/foobar: 66964 extents found
> >>
> >> and then "btrfs fi defrag /mnt/btrfs/foobar && sync" would pop the
> >> warnings.
> >>
> >> I found that the global_rsv size is just not enough for defragment,
> >> and didn't find any space leak in using global_rsv, so double it
> >> and go ahead.
> >>
> >> Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
> >> ---
> >>  fs/btrfs/extent-tree.c |    2 +-
> >>  1 files changed, 1 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> >> index 8603ee4..77ea23c 100644
> >> --- a/fs/btrfs/extent-tree.c
> >> +++ b/fs/btrfs/extent-tree.c
> >> @@ -3979,7 +3979,7 @@ static u64 calc_global_metadata_size(struct btrfs_fs_info *fs_info)
> >>  	num_bytes += div64_u64(data_used + meta_used, 50);
> >>  
> >>  	if (num_bytes * 3 > meta_used)
> >> -		num_bytes = div64_u64(meta_used, 3);
> >> +		num_bytes = div64_u64(meta_used, 3) * 2;
> >>  
> >>  	return ALIGN(num_bytes, fs_info->extent_root->leafsize << 10);
> >>  }
> > 
> > This patch breaks my system. With this applied all services fail on
> > boot with "no space left" messages.
> > 
> 
> It's weird since this patch is just aiming to enlarge our metadata
> reservation count.
> 
> so you've tried a revert or a bisect, right?  Can you show me the
> environment or any log messages?
> 
> thanks,
> liubo

Sorry for the long delay. My system was really screwed up and it took
time to fix.
First, it wasn't your patch that made the system fail. At the time, it
was just the first revision that no longer worked; I don't know why
that one. A short time later, earlier revisions showed the error too. I
was able to boot a live system from a USB stick. The filesystem was
mountable and readable, but I couldn't modify or create a single file.
Two or three times I got a

btrfs: fail to dirty inode 256 error -28

but most times nothing was reported in the logs.

The filesystem consists of three subvolumes, the default one, one for
rootfs and one for home. If I did a defrag on the rootfs, I was able to
create files. But after unmounting and remounting the filesystem, the
same error appeared again. A balance of the filesystem also failed with
a no-space error after some time.
I've backed up the filesystem, deleted the subvolumes, recreated them
and copied the data back. Now everything seems to work again. I've also
a full image of the damaged filesystem for further investigation. If
someone has an idea for testing, I'm happy to try it.


regards,
  Johannes


* Re: [PATCH] Btrfs: hold enough space for global_rsv
  2012-03-06 13:50     ` Johannes Hirte
@ 2012-03-08 19:22       ` Johannes Hirte
  2012-03-09  1:28         ` Liu Bo
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Hirte @ 2012-03-08 19:22 UTC (permalink / raw)
  To: linux-btrfs

On Tue, 6 Mar 2012 14:50:32 +0100
Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de> wrote:

> I've backed up the filesystem, deleted the subvolumes, recreated them
> and copied the data back. Now everything seems to work again. I've
> also a full image of the damaged filesystem for further
> investigation. If someone has an idea for testing, I'm happy to try
> it.

It's much worse than I thought. After a short time the same error
happened again (no space left on device). So I recreated the filesystem
(mkfs.btrfs with default values) and copied the data back from the backup,
but the error still came back. I'm now on kernel 3.2 which seems to
work. I'll try to bisect the bad commit. For info, df says:

Filesystem      Size  Used Avail Use% Mounted on
rootfs          200G  128G   69G  66% /
/dev/sda1       200G  128G   69G  66% /
rc-svcdir       1.0M  128K  896K  13% /lib64/rc/init.d
cgroup_root      10M   52K   10M   1% /sys/fs/cgroup
udev             10M  168K  9.9M   2% /dev
shm             2.0G     0  2.0G   0% /dev/shm
/dev/sda1       200G  128G   69G  66% /home

and btrfs fi df:

Data: total=149.01GB, used=118.57GB
System, DUP: total=8.00MB, used=24.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=6.38GB, used=4.55GB
Metadata: total=8.00MB, used=0.0

Kernel 3.3-rc6 fails on this with "no space left on device".


* Re: [PATCH] Btrfs: hold enough space for global_rsv
  2012-03-08 19:22       ` Johannes Hirte
@ 2012-03-09  1:28         ` Liu Bo
  2012-03-10 20:12           ` Johannes Hirte
  0 siblings, 1 reply; 8+ messages in thread
From: Liu Bo @ 2012-03-09  1:28 UTC (permalink / raw)
  To: Johannes Hirte; +Cc: linux-btrfs

On 03/09/2012 03:22 AM, Johannes Hirte wrote:
> On Tue, 6 Mar 2012 14:50:32 +0100
> Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de> wrote:
> 
>> I've backed up the filesystem, deleted the subvolumes, recreated them
>> and copied the data back. Now everything seems to work again. I've
>> also a full image of the damaged filesystem for further
>> investigation. If someone has an idea for testing, I'm happy to try
>> it.
> 
> It's much worse than I thought. After a short time the same error
> happened again (no space left on device). So I recreated the filesystem
> (mkfs.btrfs with default values) and copied the data back from the backup,
> but the error still came back. I'm now on kernel 3.2 which seems to
> work. I'll try to bisect the bad commit. For info, df says:
> 

OK, please show us the results of your bisect so we can narrow down where it goes wrong.

thanks,
liubo

> Filesystem      Size  Used Avail Use% Mounted on
> rootfs          200G  128G   69G  66% /
> /dev/sda1       200G  128G   69G  66% /
> rc-svcdir       1.0M  128K  896K  13% /lib64/rc/init.d
> cgroup_root      10M   52K   10M   1% /sys/fs/cgroup
> udev             10M  168K  9.9M   2% /dev
> shm             2.0G     0  2.0G   0% /dev/shm
> /dev/sda1       200G  128G   69G  66% /home
> 
> and btrfs fi df:
> 
> Data: total=149.01GB, used=118.57GB
> System, DUP: total=8.00MB, used=24.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=6.38GB, used=4.55GB
> Metadata: total=8.00MB, used=0.0
> 
> Kernel 3.3-rc6 fails on this with "no space left on device".



* Re: [PATCH] Btrfs: hold enough space for global_rsv
  2012-03-09  1:28         ` Liu Bo
@ 2012-03-10 20:12           ` Johannes Hirte
  0 siblings, 0 replies; 8+ messages in thread
From: Johannes Hirte @ 2012-03-10 20:12 UTC (permalink / raw)
  To: Liu Bo; +Cc: linux-btrfs

On Fri, 09 Mar 2012 09:28:56 +0800
Liu Bo <liubo2009@cn.fujitsu.com> wrote:

> On 03/09/2012 03:22 AM, Johannes Hirte wrote:
> > On Tue, 6 Mar 2012 14:50:32 +0100
> > Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de> wrote:
> > 
> >> I've backed up the filesystem, deleted the subvolumes, recreated
> >> them and copied the data back. Now everything seems to work again.
> >> I've also a full image of the damaged filesystem for further
> >> investigation. If someone has an idea for testing, I'm happy to try
> >> it.
> > 
> > It's much worse than I thought. After a short time the same error
> > happened again (no space left on device). So I recreated the
> > filesystem (mkfs.btrfs with default values) and copied the data back
> > from the backup, but the error still came back. I'm now on kernel
> > 3.2 which seems to work. I'll try to bisect the bad commit. For
> > info, df says:
> > 
> 
> OK, please show us the results of your bisect so we can narrow down
> where it goes wrong.
> 
> thanks,
> liubo

Bisect points again to:

5500cdbe14d7435e04f66ff3cfb8ecd8b8e44ebf is the first bad commit
commit 5500cdbe14d7435e04f66ff3cfb8ecd8b8e44ebf
Author: Liu Bo <liubo2009@cn.fujitsu.com>
Date:   Thu Feb 23 10:49:04 2012 -0500

    Btrfs: increase the global block reserve estimates
    
    When doing IO with large amounts of data fragmentation, the global
    block reserve calculations are too low.  This increases them to avoid
    ENOSPC crashes.
    
    Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
    Signed-off-by: Chris Mason <chris.mason@oracle.com>

The revision before it works, and reverting this commit from master
works too. But as mentioned before, I'm not sure this is the root cause:
the first time I saw the error, it later happened without this patch
too.

