linux-btrfs.vger.kernel.org archive mirror
* btrfs: qgroup scan failed with -12
@ 2013-09-23  0:43 Tomasz Chmielewski
  2013-09-23 13:00 ` Wang Shilong
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Tomasz Chmielewski @ 2013-09-23  0:43 UTC (permalink / raw)
  To: linux-btrfs@vger.kernel.org

Not sure if it's anything interesting - I had the following entry in
dmesg a few days ago, on a server with 32 GB RAM. The system is still working fine.

[1878432.675210] btrfs-qgroup-re: page allocation failure: order:5, mode:0x104050
[1878432.675319] CPU: 5 PID: 22251 Comm: btrfs-qgroup-re Not tainted 3.11.0-rc7 #2
[1878432.675417] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 1106 10/17/2011
[1878432.675517]  0000000000104050 ffff88062a981948 ffffffff81378c52 ffff88083fb4d958
[1878432.675618]  0000000000000001 ffff88062a9819d8 ffffffff810af9a4 0000000000000000
[1878432.675721]  ffffffff817c8100 ffff88062a981978 ffff88083fddea00 ffff88083fddea38
[1878432.675821] Call Trace:
[1878432.675874]  [<ffffffff81378c52>] dump_stack+0x46/0x58
[1878432.675927]  [<ffffffff810af9a4>] warn_alloc_failed+0x110/0x124
[1878432.675981]  [<ffffffff810b1fd8>] __alloc_pages_nodemask+0x6a4/0x793
[1878432.676036]  [<ffffffff810db7e8>] alloc_pages_current+0xc8/0xe5
[1878432.676098]  [<ffffffff810af06c>] __get_free_pages+0x9/0x36
[1878432.676150]  [<ffffffff810e27b9>] __kmalloc_track_caller+0x35/0x163
[1878432.676204]  [<ffffffff810bde12>] krealloc+0x52/0x8c
[1878432.676265]  [<ffffffffa036cdcb>] ulist_add_merge+0xe1/0x14e [btrfs]
[1878432.676324]  [<ffffffffa036bcf0>] find_parent_nodes+0x49c/0x5a5 [btrfs]
[1878432.676383]  [<ffffffffa036be75>] btrfs_find_all_roots+0x7c/0xd7 [btrfs]
[1878432.676441]  [<ffffffffa036d6e1>] ? qgroup_account_ref_step1+0xea/0x102 [btrfs]
[1878432.676542]  [<ffffffffa036d915>] btrfs_qgroup_rescan_worker+0x21c/0x516 [btrfs]
[1878432.676645]  [<ffffffffa03482cc>] worker_loop+0x15e/0x48e [btrfs]
[1878432.676702]  [<ffffffffa034816e>] ? btrfs_queue_worker+0x267/0x267 [btrfs]
[1878432.676757]  [<ffffffff8104e51a>] kthread+0xb5/0xbd
[1878432.676809]  [<ffffffff8104e465>] ? kthread_freezable_should_stop+0x43/0x43
[1878432.676881]  [<ffffffff8137da2c>] ret_from_fork+0x7c/0xb0
[1878432.676950]  [<ffffffff8104e465>] ? kthread_freezable_should_stop+0x43/0x43
[1878432.677004] Mem-Info:
[1878432.678293] Node 0 DMA per-cpu:
[1878432.678341] CPU    0: hi:    0, btch:   1 usd:   0
[1878432.678392] CPU    1: hi:    0, btch:   1 usd:   0
[1878432.678443] CPU    2: hi:    0, btch:   1 usd:   0
[1878432.678494] CPU    3: hi:    0, btch:   1 usd:   0
[1878432.678544] CPU    4: hi:    0, btch:   1 usd:   0
[1878432.678595] CPU    5: hi:    0, btch:   1 usd:   0
[1878432.678646] CPU    6: hi:    0, btch:   1 usd:   0
[1878432.678697] CPU    7: hi:    0, btch:   1 usd:   0
[1878432.678747] Node 0 DMA32 per-cpu:
[1878432.678797] CPU    0: hi:  186, btch:  31 usd:   0
[1878432.678847] CPU    1: hi:  186, btch:  31 usd:   0
[1878432.678897] CPU    2: hi:  186, btch:  31 usd:   0
[1878432.678948] CPU    3: hi:  186, btch:  31 usd:   0
[1878432.678998] CPU    4: hi:  186, btch:  31 usd:   0
[1878432.679049] CPU    5: hi:  186, btch:  31 usd:   0
[1878432.679111] CPU    6: hi:  186, btch:  31 usd:   0
[1878432.679162] CPU    7: hi:  186, btch:  31 usd:   2
[1878432.679214] Node 0 Normal per-cpu:
[1878432.679270] CPU    0: hi:  186, btch:  31 usd:  31
[1878432.679321] CPU    1: hi:  186, btch:  31 usd:   0
[1878432.679372] CPU    2: hi:  186, btch:  31 usd:   0
[1878432.679444] CPU    3: hi:  186, btch:  31 usd:   0
[1878432.679495] CPU    4: hi:  186, btch:  31 usd:   0
[1878432.679546] CPU    5: hi:  186, btch:  31 usd:   0
[1878432.679596] CPU    6: hi:  186, btch:  31 usd:  30
[1878432.679647] CPU    7: hi:  186, btch:  31 usd: 169
[1878432.679700] active_anon:1062992 inactive_anon:522620 isolated_anon:0
[1878432.679700]  active_file:1050823 inactive_file:1052143 isolated_file:0
[1878432.679700]  unevictable:5892 dirty:1004 writeback:0 unstable:0
[1878432.679700]  free:2184567 slab_reclaimable:1142980 slab_unreclaimable:86405
[1878432.679700]  mapped:754301 shmem:9119 pagetables:21518 bounce:0
[1878432.679700]  free_cma:0
[1878432.680007] Node 0 DMA free:15360kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15360kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1878432.680322] lowmem_reserve[]: 0 2897 32077 32077
[1878432.680375] Node 0 DMA32 free:417660kB min:2068kB low:2584kB high:3100kB active_anon:4384kB inactive_anon:59696kB active_file:45640kB inactive_file:45652kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3040872kB managed:2968600kB mlocked:0kB dirty:4kB writeback:0kB mapped:224kB shmem:324kB slab_reclaimable:2333732kB slab_unreclaimable:59700kB kernel_stack:8kB pagetables:1708kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:4 all_unreclaimable? no
[1878432.680771] lowmem_reserve[]: 0 0 29180 29180
[1878432.680831] Node 0 Normal free:8305248kB min:20844kB low:26052kB high:31264kB active_anon:4247584kB inactive_anon:2030784kB active_file:4157652kB inactive_file:4162920kB unevictable:23568kB isolated(anon):0kB isolated(file):0kB present:30406656kB managed:29880648kB mlocked:23568kB dirty:4012kB writeback:0kB mapped:3016980kB shmem:36152kB slab_reclaimable:2238188kB slab_unreclaimable:285920kB kernel_stack:3328kB pagetables:84364kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1878432.681207] lowmem_reserve[]: 0 0 0 0
[1878432.681258] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15360kB
[1878432.681371] Node 0 DMA32: 19878*4kB (UEM) 18133*8kB (UEM) 12056*16kB (UEM) 1*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 417504kB
[1878432.681486] Node 0 Normal: 544554*4kB (EM) 536220*8kB (UEM) 114800*16kB (UM) 1*32kB (M) 2*64kB (R) 2*128kB (R) 1*256kB (R) 0*512kB 1*1024kB (R) 0*2048kB 0*4096kB = 8306472kB
[1878432.681649] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1878432.681747] 2186705 total pagecache pages
[1878432.681797] 73757 pages in swap cache
[1878432.681846] Swap cache stats: add 8877981, delete 8804224, find 97677523/98270314
[1878432.681943] Free swap  = 32338688kB
[1878432.681992] Total swap = 33553332kB
[1878432.762593] 8388095 pages RAM
[1878432.762659] 171943 pages reserved
[1878432.762716] 4144353 pages shared
[1878432.762764] 3906447 pages non-shared
[1878432.762841] btrfs: qgroup scan failed with -12


-- 
Tomasz Chmielewski


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: btrfs: qgroup scan failed with -12
  2013-09-23  0:43 btrfs: qgroup scan failed with -12 Tomasz Chmielewski
@ 2013-09-23 13:00 ` Wang Shilong
  2013-09-23 14:57 ` Josef Bacik
  2013-09-23 17:19 ` David Sterba
  2 siblings, 0 replies; 7+ messages in thread
From: Wang Shilong @ 2013-09-23 13:00 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: linux-btrfs@vger.kernel.org

Hello,

I think this problem may be related to the qgroup memory leak that you also
reported before; however, I have not been able to reproduce it on my test box.

By the way, does your machine still show high memory usage with quota enabled?

Thanks,
Wang
> Not sure if it's anything interesting - I had the following entry in
> dmesg a few days ago, on a server with 32 GB RAM. The system is still working fine.



* Re: btrfs: qgroup scan failed with -12
  2013-09-23  0:43 btrfs: qgroup scan failed with -12 Tomasz Chmielewski
  2013-09-23 13:00 ` Wang Shilong
@ 2013-09-23 14:57 ` Josef Bacik
  2013-09-23 17:19 ` David Sterba
  2 siblings, 0 replies; 7+ messages in thread
From: Josef Bacik @ 2013-09-23 14:57 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: linux-btrfs@vger.kernel.org

On Mon, Sep 23, 2013 at 07:43:44AM +0700, Tomasz Chmielewski wrote:
> Not sure if it's anything interesting - I had the following entry in
> dmesg a few days ago, on a server with 32 GB RAM. The system is still working fine.
> 

Can you try the patch here

https://bugzilla.kernel.org/attachment.cgi?id=107408&action=diff

and see if that helps?  Thanks,

Josef


* Re: btrfs: qgroup scan failed with -12
  2013-09-23  0:43 btrfs: qgroup scan failed with -12 Tomasz Chmielewski
  2013-09-23 13:00 ` Wang Shilong
  2013-09-23 14:57 ` Josef Bacik
@ 2013-09-23 17:19 ` David Sterba
  2013-09-23 17:43   ` Josef Bacik
  2 siblings, 1 reply; 7+ messages in thread
From: David Sterba @ 2013-09-23 17:19 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: linux-btrfs@vger.kernel.org

On Mon, Sep 23, 2013 at 07:43:44AM +0700, Tomasz Chmielewski wrote:
> Not sure if it's anything interesting - I had the following entry in
> dmesg a few days ago, on a server with 32 GB RAM. The system is still working fine.

Yes, this is of course interesting.
> 
> [1878432.675210] btrfs-qgroup-re: page allocation failure: order:5, mode:0x104050

Order 5 allocation (32 contiguous pages, i.e. 128 KiB), not guaranteed to succeed.

> [1878432.675319] CPU: 5 PID: 22251 Comm: btrfs-qgroup-re Not tainted 3.11.0-rc7 #2
> [1878432.676204]  [<ffffffff810bde12>] krealloc+0x52/0x8c
> [1878432.676324]  [<ffffffffa036bcf0>] find_parent_nodes+0x49c/0x5a5 [btrfs]
> [1878432.676383]  [<ffffffffa036be75>] btrfs_find_all_roots+0x7c/0xd7 [btrfs]
> [1878432.676441]  [<ffffffffa036d6e1>] ? qgroup_account_ref_step1+0xea/0x102 [btrfs]
> [1878432.676542]  [<ffffffffa036d915>] btrfs_qgroup_rescan_worker+0x21c/0x516 [btrfs]

220                 new_nodes = krealloc(old, sizeof(*new_nodes) * new_alloced,
221                                      gfp_mask);
222                 if (!new_nodes)
223                         return -ENOMEM;

The requested size is between 64k and 128k; at 40 bytes per ulist_node
that is 1638 to 3276 elements. So there is a lot going on during the
rescan, which is to be expected.

I don't know whether the krealloc can be replaced with something friendlier
to the allocator, e.g. a list of page-sized blocks instead of one contiguous
array.

david


* Re: btrfs: qgroup scan failed with -12
  2013-09-23 17:19 ` David Sterba
@ 2013-09-23 17:43   ` Josef Bacik
  2013-09-25 10:11     ` David Sterba
  0 siblings, 1 reply; 7+ messages in thread
From: Josef Bacik @ 2013-09-23 17:43 UTC (permalink / raw)
  To: dsterba, Tomasz Chmielewski, linux-btrfs@vger.kernel.org

On Mon, Sep 23, 2013 at 07:19:06PM +0200, David Sterba wrote:
> On Mon, Sep 23, 2013 at 07:43:44AM +0700, Tomasz Chmielewski wrote:
> > Not sure if it's anything interesting - I had the following entry in
> > dmesg a few days ago, on a server with 32 GB RAM. The system is still working fine.
> 
> Yes this is interesting of course.
> > 
> > [1878432.675210] btrfs-qgroup-re: page allocation failure: order:5, mode:0x104050
> 
> Order 5 allocation, not guaranteed to succeed.
> 
> > [1878432.675319] CPU: 5 PID: 22251 Comm: btrfs-qgroup-re Not tainted 3.11.0-rc7 #2
> > [1878432.676204]  [<ffffffff810bde12>] krealloc+0x52/0x8c
> > [1878432.676324]  [<ffffffffa036bcf0>] find_parent_nodes+0x49c/0x5a5 [btrfs]
> > [1878432.676383]  [<ffffffffa036be75>] btrfs_find_all_roots+0x7c/0xd7 [btrfs]
> > [1878432.676441]  [<ffffffffa036d6e1>] ? qgroup_account_ref_step1+0xea/0x102 [btrfs]
> > [1878432.676542]  [<ffffffffa036d915>] btrfs_qgroup_rescan_worker+0x21c/0x516 [btrfs]
> 
> 220                 new_nodes = krealloc(old, sizeof(*new_nodes) * new_alloced,
> 221                                      gfp_mask);
> 222                 if (!new_nodes)
> 223                         return -ENOMEM;
> 
> The requested size is between 64k and 128k, with 40 bytes of ulist_node
> it's 1638 to 3276 elements. So, lots of things going on during the
> rescan, quite expectable.
> 
> I don't know if krealloc can be replaced with something more friendly to
> allocator, eg. a list of page-sized blocks instead of one contiguous
> array.
> 

I've done that with a patch in bugzilla; hopefully that will fix it.  I've
not had time to try to reproduce it myself, but I assume that if you do
something like create a random file, then create 100000 snapshots, and then
defrag it, you will probably hit the same problem.  Thanks,

Josef


* Re: btrfs: qgroup scan failed with -12
  2013-09-23 17:43   ` Josef Bacik
@ 2013-09-25 10:11     ` David Sterba
  2013-09-25 11:30       ` Tomasz Chmielewski
  0 siblings, 1 reply; 7+ messages in thread
From: David Sterba @ 2013-09-25 10:11 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Tomasz Chmielewski, linux-btrfs@vger.kernel.org

On Mon, Sep 23, 2013 at 01:43:48PM -0400, Josef Bacik wrote:
> I've done that with a patch in bugzilla, hopefully that will fix it.  I've not
> had time to try and reproduce myself, but I assume if you do something like
> create a random file, then create 100000 snapshots and then defrag it will
> probably hit the same problem.  Thanks,

The patch looks ok.

I've tried to reproduce it with 100k subvolumes but did not manage to hit
the allocation failure.


david


* Re: btrfs: qgroup scan failed with -12
  2013-09-25 10:11     ` David Sterba
@ 2013-09-25 11:30       ` Tomasz Chmielewski
  0 siblings, 0 replies; 7+ messages in thread
From: Tomasz Chmielewski @ 2013-09-25 11:30 UTC (permalink / raw)
  To: dsterba; +Cc: Josef Bacik, linux-btrfs@vger.kernel.org

On Wed, 25 Sep 2013 12:11:24 +0200
David Sterba <dsterba@suse.cz> wrote:

> On Mon, Sep 23, 2013 at 01:43:48PM -0400, Josef Bacik wrote:
> > I've done that with a patch in bugzilla, hopefully that will fix
> > it.  I've not had time to try and reproduce myself, but I assume if
> > you do something like create a random file, then create 100000
> > snapshots and then defrag it will probably hit the same problem.
> > Thanks,
> 
> The patch looks ok.
> 
> I've tried to reproduce it with 100k subvolumes but was not lucky to
> hit the allocation failure.

For me, it was quite easy (i.e. every few days) to hang the server with
32 GB RAM with the following:

- BackupPC running,

- fs mounted with noatime,compress-force=zlib

- extended inode refs enabled,

- skinny metadata extent refs enabled,

- qgroups enabled.


For those unfamiliar, BackupPC [1] is a backup program.
It works by rsyncing data from (possibly many) remote systems.

Then it "deduplicates" identical files by hardlinking them to
identical ones in its pool.
As a result, a system running BackupPC will have lots of files
(multiple revisions of backups, from many systems) with lots of
hardlinks.

Additionally, once per night, BackupPC scans its pool of hardlinked
files to find the ones that no backup links to any more (meaning the
file is unused and can be removed); typically, the process takes many
hours to finish. I believe this is when my system with btrfs was hanging.
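The idea behind that nightly cleanup can be sketched in a few lines of
shell (an illustration of the principle only, not BackupPC's actual
implementation; the file names are made up): a pool file whose link
count has dropped back to 1 is no longer referenced by any backup.

```shell
#!/bin/sh
# Illustration only -- not BackupPC's code.
set -e
pool=$(mktemp -d)

echo data > "$pool/file-a"          # pool file...
ln "$pool/file-a" "$pool/backup-a"  # ...referenced by a backup (link count 2)
echo data > "$pool/file-b"          # pool file no backup references

# Files with link count 1 are unused and can be removed; walking
# millions of such inodes is the scan that takes many hours each night.
unused=$(find "$pool" -type f -links 1)
echo "unused: $unused"

rm -rf "$pool"
```

On a btrfs with qgroups enabled, walking that many inodes and references
is presumably what kept the rescan worker so busy.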

BackupPC doesn't use any specific btrfs features, like snapshots.


With the above, my system was hanging regularly, every few days.


I've disabled BackupPC and I'm running rsync + btrfs snapshot now - no
hang since then (at least so far). It could be a coincidence, but who
knows.

[1] http://backuppc.sourceforge.net

-- 
Tomasz Chmielewski
http://wpkg.org


end of thread, other threads:[~2013-09-25 11:30 UTC | newest]

Thread overview: 7+ messages
2013-09-23  0:43 btrfs: qgroup scan failed with -12 Tomasz Chmielewski
2013-09-23 13:00 ` Wang Shilong
2013-09-23 14:57 ` Josef Bacik
2013-09-23 17:19 ` David Sterba
2013-09-23 17:43   ` Josef Bacik
2013-09-25 10:11     ` David Sterba
2013-09-25 11:30       ` Tomasz Chmielewski
