* memory overflow or undeflow in free space tree / space_info?
@ 2016-07-29 18:40 Stefan Priebe - Profihost AG
2016-07-29 19:11 ` Omar Sandoval
0 siblings, 1 reply; 14+ messages in thread
From: Stefan Priebe - Profihost AG @ 2016-07-29 18:40 UTC
To: linux-btrfs@vger.kernel.org; +Cc: osandov
Dear list,
I'm seeing btrfs "no space" messages frequently on big filesystems (> 30TB).
In all cases I'm getting a trace like this one and a space_info warning
(since commit [1]). Could someone please be so kind as to help me
debug and fix this bug? I'm using space_cache=v2 on all those systems.
------------[ cut here ]------------
WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5710
btrfs_free_block_groups+0x35a/0x400 [btrfs]()
Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas
raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables
x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel
usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core
usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod
raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas
pps_core
CPU: 5 PID: 26421 Comm: umount Tainted: G W O 4.4.15+43-ph #1
Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
0000000000000000 ffff880ae8b47cd8 ffffffffbd3c712f 0000000000000000
ffffffffc03ec603 ffff880ae8b47d18 ffffffffbd0837e7 00000047a0000000
0000000000000000 ffff8806016a1400 ffff8808881d2088 ffff8808881d2000
Call Trace:
[<ffffffffbd3c712f>] dump_stack+0x63/0x84
[<ffffffffbd0837e7>] warn_slowpath_common+0x97/0xe0
[<ffffffffbd08384a>] warn_slowpath_null+0x1a/0x20
[<ffffffffc034a17a>] btrfs_free_block_groups+0x35a/0x400 [btrfs]
[<ffffffffc035ba4b>] close_ctree+0x15b/0x330 [btrfs]
[<ffffffffc03291f9>] btrfs_put_super+0x19/0x20 [btrfs]
[<ffffffffbd1cd33f>] generic_shutdown_super+0x6f/0x100
[<ffffffffbd1cd866>] kill_anon_super+0x16/0x30
[<ffffffffc032f96a>] btrfs_kill_super+0x1a/0xb0 [btrfs]
[<ffffffffbd1cda31>] deactivate_locked_super+0x51/0x90
[<ffffffffbd1ce42e>] deactivate_super+0x4e/0x70
[<ffffffffbd1e9373>] cleanup_mnt+0x43/0x90
[<ffffffffbd1e9412>] __cleanup_mnt+0x12/0x20
[<ffffffffbd09ef8e>] task_work_run+0x7e/0xa0
[<ffffffffbd07e550>] exit_to_usermode_loop+0x66/0x95
[<ffffffffbd002a56>] syscall_return_slowpath+0xa6/0xf0
[<ffffffffbd6b6f4c>] int_ret_from_sys_call+0x25/0x8f
---[ end trace bd985b05cc90617f ]---
------------[ cut here ]------------
WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5711
btrfs_free_block_groups+0x3f4/0x400 [btrfs]()
Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas
raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables
x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel
usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core
usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod
raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas
pps_core
CPU: 5 PID: 26421 Comm: umount Tainted: G W O 4.4.15+43-ph #1
Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
0000000000000000 ffff880ae8b47cd8 ffffffffbd3c712f 0000000000000000
ffffffffc03ec603 ffff880ae8b47d18 ffffffffbd0837e7 00000047a0000000
0000000000000000 ffff8806016a1400 ffff8808881d2088 ffff8808881d2000
Call Trace:
[<ffffffffbd3c712f>] dump_stack+0x63/0x84
[<ffffffffbd0837e7>] warn_slowpath_common+0x97/0xe0
[<ffffffffbd08384a>] warn_slowpath_null+0x1a/0x20
[<ffffffffc034a214>] btrfs_free_block_groups+0x3f4/0x400 [btrfs]
[<ffffffffc035ba4b>] close_ctree+0x15b/0x330 [btrfs]
[<ffffffffc03291f9>] btrfs_put_super+0x19/0x20 [btrfs]
[<ffffffffbd1cd33f>] generic_shutdown_super+0x6f/0x100
[<ffffffffbd1cd866>] kill_anon_super+0x16/0x30
[<ffffffffc032f96a>] btrfs_kill_super+0x1a/0xb0 [btrfs]
[<ffffffffbd1cda31>] deactivate_locked_super+0x51/0x90
[<ffffffffbd1ce42e>] deactivate_super+0x4e/0x70
[<ffffffffbd1e9373>] cleanup_mnt+0x43/0x90
[<ffffffffbd1e9412>] __cleanup_mnt+0x12/0x20
[<ffffffffbd09ef8e>] task_work_run+0x7e/0xa0
[<ffffffffbd07e550>] exit_to_usermode_loop+0x66/0x95
[<ffffffffbd002a56>] syscall_return_slowpath+0xa6/0xf0
[<ffffffffbd6b6f4c>] int_ret_from_sys_call+0x25/0x8f
---[ end trace bd985b05cc906180 ]---
------------[ cut here ]------------
WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:9990
btrfs_free_block_groups+0x2a4/0x400 [btrfs]()
Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas
raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables
x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel
usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core
usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod
raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas
pps_core
CPU: 5 PID: 26421 Comm: umount Tainted: G W O 4.4.15+43-ph #1
Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
0000000000000000 ffff880ae8b47cd8 ffffffffbd3c712f 0000000000000000
ffffffffc03ec603 ffff880ae8b47d18 ffffffffbd0837e7 ffff880c6aaa4528
0000000000000038 0000000000000000 ffff8802fe8d8c88 ffff8808881d2000
Call Trace:
[<ffffffffbd3c712f>] dump_stack+0x63/0x84
[<ffffffffbd0837e7>] warn_slowpath_common+0x97/0xe0
[<ffffffffbd08384a>] warn_slowpath_null+0x1a/0x20
[<ffffffffc034a0c4>] btrfs_free_block_groups+0x2a4/0x400 [btrfs]
[<ffffffffc035ba4b>] close_ctree+0x15b/0x330 [btrfs]
[<ffffffffc03291f9>] btrfs_put_super+0x19/0x20 [btrfs]
[<ffffffffbd1cd33f>] generic_shutdown_super+0x6f/0x100
[<ffffffffbd1cd866>] kill_anon_super+0x16/0x30
[<ffffffffc032f96a>] btrfs_kill_super+0x1a/0xb0 [btrfs]
[<ffffffffbd1cda31>] deactivate_locked_super+0x51/0x90
[<ffffffffbd1ce42e>] deactivate_super+0x4e/0x70
[<ffffffffbd1e9373>] cleanup_mnt+0x43/0x90
[<ffffffffbd1e9412>] __cleanup_mnt+0x12/0x20
[<ffffffffbd09ef8e>] task_work_run+0x7e/0xa0
[<ffffffffbd07e550>] exit_to_usermode_loop+0x66/0x95
[<ffffffffbd002a56>] syscall_return_slowpath+0xa6/0xf0
[<ffffffffbd6b6f4c>] int_ret_from_sys_call+0x25/0x8f
---[ end trace bd985b05cc906181 ]---
BTRFS: space_info 4 has 18446743491956604928 free, is not full
BTRFS: space_info total=307627032576, used=206629289984, pinned=0,
reserved=0, may_use=682750558208, readonly=131072
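A quick sanity check on the numbers in the space_info message, as a sketch: it assumes the "free" figure is total minus all the other counters, computed in unsigned 64-bit arithmetic. That formula is an assumption (the thread does not quote the kernel code that prints the message), and the helper name `free_bytes_u64` is made up for illustration. Under that assumption the huge "free" value is simply a negative number that wrapped around.

```python
# Counters copied from the "BTRFS: space_info" message above.
counters = {
    "total": 307627032576,
    "used": 206629289984,
    "pinned": 0,
    "reserved": 0,
    "may_use": 682750558208,
    "readonly": 131072,
}

U64 = 1 << 64  # unsigned 64-bit arithmetic wraps modulo 2**64


def free_bytes_u64(total, used, pinned, reserved, may_use, readonly):
    """Assumed formula: total minus every other counter, wrapped to u64."""
    return (total - used - pinned - reserved - may_use - readonly) % U64


reported_free = 18446743491956604928  # value printed in the log
free = free_bytes_u64(**counters)

# Reinterpret the wrapped value as a signed 64-bit number.
signed_free = free - U64 if free >= (1 << 63) else free

print(free == reported_free)
print(signed_free)
```

If the assumption holds, the accounting is short by roughly 581 GB, and may_use (about 682 GB) alone is larger than total (about 307 GB).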
Greets,
Stefan
[1]
https://git.kernel.org/cgit/linux/kernel/git/kdave/linux.git/commit/?h=for-next&id=d555b6c380c644af63dbdaa7cc14bba041a4e4dd
* Re: memory overflow or undeflow in free space tree / space_info?
2016-07-29 18:40 memory overflow or undeflow in free space tree / space_info? Stefan Priebe - Profihost AG
@ 2016-07-29 19:11 ` Omar Sandoval
2016-07-29 19:14 ` Omar Sandoval
2016-07-29 19:39 ` Stefan Priebe - Profihost AG
0 siblings, 2 replies; 14+ messages in thread
From: Omar Sandoval @ 2016-07-29 19:11 UTC
To: Stefan Priebe - Profihost AG; +Cc: linux-btrfs@vger.kernel.org, Josef Bacik
On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
> Dear list,
>
> i'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>
> In all cases i'm getting a trace like this one a space_info warning.
> (since commit [1]). Could someone please be so kind and help me
> debugging / fixing this bug? I'm using space_cache=v2 on all those systems.
Hm, so I think this indicates a bug in space accounting somewhere else
rather than the free space tree itself. I haven't debugged one of these
issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
> ------------[ cut here ]------------
> WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5710
Do these line numbers match up with yours?
5706 static void release_global_block_rsv(struct btrfs_fs_info *fs_info)
5707 {
5708         block_rsv_release_bytes(fs_info, &fs_info->global_block_rsv, NULL,
5709                                 (u64)-1);
5710         WARN_ON(fs_info->delalloc_block_rsv.size > 0);
5711         WARN_ON(fs_info->delalloc_block_rsv.reserved > 0);
5712         WARN_ON(fs_info->trans_block_rsv.size > 0);
5713         WARN_ON(fs_info->trans_block_rsv.reserved > 0);
5714         WARN_ON(fs_info->chunk_block_rsv.size > 0);
5715         WARN_ON(fs_info->chunk_block_rsv.reserved > 0);
5716         WARN_ON(fs_info->delayed_block_rsv.size > 0);
5717         WARN_ON(fs_info->delayed_block_rsv.reserved > 0);
5718 }
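If the line numbers do match, the reported warnings can be mapped onto the checks in this listing. The following is only a small sketch that parses the listing as quoted here (the helper name `warn_checks` is made up); it says nothing about what any other tree has at those lines.

```python
import re

# The listing exactly as quoted in this message, line numbers included.
LISTING = """\
5706 static void release_global_block_rsv(struct btrfs_fs_info *fs_info)
5707 {
5708 block_rsv_release_bytes(fs_info, &fs_info->global_block_rsv, NULL,
5709 (u64)-1);
5710 WARN_ON(fs_info->delalloc_block_rsv.size > 0);
5711 WARN_ON(fs_info->delalloc_block_rsv.reserved > 0);
5712 WARN_ON(fs_info->trans_block_rsv.size > 0);
5713 WARN_ON(fs_info->trans_block_rsv.reserved > 0);
5714 WARN_ON(fs_info->chunk_block_rsv.size > 0);
5715 WARN_ON(fs_info->chunk_block_rsv.reserved > 0);
5716 WARN_ON(fs_info->delayed_block_rsv.size > 0);
5717 WARN_ON(fs_info->delayed_block_rsv.reserved > 0);
5718 }
"""


def warn_checks(listing):
    """Map source line number -> condition inside WARN_ON(...)."""
    checks = {}
    for line in listing.splitlines():
        m = re.match(r"\s*(\d+)\s+WARN_ON\((.*)\);\s*$", line)
        if m:
            checks[int(m.group(1))] = m.group(2)
    return checks


checks = warn_checks(LISTING)

# Line numbers taken from the three WARNING headers in the report.
for lineno in (5710, 5711, 9990):
    print(lineno, checks.get(lineno, "not in this listing"))
```

Under that mapping, the first two warnings would mean that both the size and the reserved counter of delalloc_block_rsv were still non-zero at unmount; the third warning falls outside this function.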
> btrfs_free_block_groups+0x35a/0x400 [btrfs]()
> Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas
> raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables
> x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel
> usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core
> usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod
> raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
> xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas
> pps_core
> CPU: 5 PID: 26421 Comm: umount Tainted: G W O 4.4.15+43-ph #1
> Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
> 0000000000000000 ffff880ae8b47cd8 ffffffffbd3c712f 0000000000000000
> ffffffffc03ec603 ffff880ae8b47d18 ffffffffbd0837e7 00000047a0000000
> 0000000000000000 ffff8806016a1400 ffff8808881d2088 ffff8808881d2000
> Call Trace:
> [<ffffffffbd3c712f>] dump_stack+0x63/0x84
> [<ffffffffbd0837e7>] warn_slowpath_common+0x97/0xe0
> [<ffffffffbd08384a>] warn_slowpath_null+0x1a/0x20
> [<ffffffffc034a17a>] btrfs_free_block_groups+0x35a/0x400 [btrfs]
> [<ffffffffc035ba4b>] close_ctree+0x15b/0x330 [btrfs]
> [<ffffffffc03291f9>] btrfs_put_super+0x19/0x20 [btrfs]
> [<ffffffffbd1cd33f>] generic_shutdown_super+0x6f/0x100
> [<ffffffffbd1cd866>] kill_anon_super+0x16/0x30
> [<ffffffffc032f96a>] btrfs_kill_super+0x1a/0xb0 [btrfs]
> [<ffffffffbd1cda31>] deactivate_locked_super+0x51/0x90
> [<ffffffffbd1ce42e>] deactivate_super+0x4e/0x70
> [<ffffffffbd1e9373>] cleanup_mnt+0x43/0x90
> [<ffffffffbd1e9412>] __cleanup_mnt+0x12/0x20
> [<ffffffffbd09ef8e>] task_work_run+0x7e/0xa0
> [<ffffffffbd07e550>] exit_to_usermode_loop+0x66/0x95
> [<ffffffffbd002a56>] syscall_return_slowpath+0xa6/0xf0
> [<ffffffffbd6b6f4c>] int_ret_from_sys_call+0x25/0x8f
> ---[ end trace bd985b05cc90617f ]---
> ------------[ cut here ]------------
> WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5711
> btrfs_free_block_groups+0x3f4/0x400 [btrfs]()
> Modules linked in: netconsole ipt_REJECT nf_reject_ipv4 mpt3sas
> raid_class scsi_transport_sas xt_multiport iptable_filter ip_tables
> x_tables 8021q garp bonding coretemp loop i40e(O) vxlan ip6_udp_tunnel
> usbhid udp_tunnel sb_edac ehci_pci edac_core ehci_hcd i2c_i801 i2c_core
> usbcore shpchp usb_common ipmi_si ipmi_msghandler button btrfs dm_mod
> raid1 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx
> xor raid6_pq md_mod ixgbe mdio sg sd_mod ahci ptp libahci megaraid_sas
> pps_core
> CPU: 5 PID: 26421 Comm: umount Tainted: G W O 4.4.15+43-ph #1
> Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c 02/18/2015
> 0000000000000000 ffff880ae8b47cd8 ffffffffbd3c712f 0000000000000000
> ffffffffc03ec603 ffff880ae8b47d18 ffffffffbd0837e7 00000047a0000000
> 0000000000000000 ffff8806016a1400 ffff8808881d2088 ffff8808881d2000
> Call Trace:
> [<ffffffffbd3c712f>] dump_stack+0x63/0x84
> [<ffffffffbd0837e7>] warn_slowpath_common+0x97/0xe0
> [<ffffffffbd08384a>] warn_slowpath_null+0x1a/0x20
> [<ffffffffc034a214>] btrfs_free_block_groups+0x3f4/0x400 [btrfs]
> [<ffffffffc035ba4b>] close_ctree+0x15b/0x330 [btrfs]
> [<ffffffffc03291f9>] btrfs_put_super+0x19/0x20 [btrfs]
> [<ffffffffbd1cd33f>] generic_shutdown_super+0x6f/0x100
> [<ffffffffbd1cd866>] kill_anon_super+0x16/0x30
> [<ffffffffc032f96a>] btrfs_kill_super+0x1a/0xb0 [btrfs]
> [<ffffffffbd1cda31>] deactivate_locked_super+0x51/0x90
> [<ffffffffbd1ce42e>] deactivate_super+0x4e/0x70
> [<ffffffffbd1e9373>] cleanup_mnt+0x43/0x90
> [<ffffffffbd1e9412>] __cleanup_mnt+0x12/0x20
> [<ffffffffbd09ef8e>] task_work_run+0x7e/0xa0
> [<ffffffffbd07e550>] exit_to_usermode_loop+0x66/0x95
> [<ffffffffbd002a56>] syscall_return_slowpath+0xa6/0xf0
> [<ffffffffbd6b6f4c>] int_ret_from_sys_call+0x25/0x8f
> ---[ end trace bd985b05cc906180 ]---
> ------------[ cut here ]------------
> WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:9990
I don't see what warning this is in kdave/for-next.
--
Omar
* Re: memory overflow or undeflow in free space tree / space_info?
2016-07-29 19:11 ` Omar Sandoval
@ 2016-07-29 19:14 ` Omar Sandoval
2016-07-29 19:40 ` Stefan Priebe - Profihost AG
2016-07-29 21:03 ` Josef Bacik
2016-07-29 19:39 ` Stefan Priebe - Profihost AG
1 sibling, 2 replies; 14+ messages in thread
From: Omar Sandoval @ 2016-07-29 19:14 UTC
To: Stefan Priebe - Profihost AG; +Cc: linux-btrfs@vger.kernel.org, Josef Bacik
On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
> > Dear list,
> >
> > i'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
> >
> > In all cases i'm getting a trace like this one a space_info warning.
> > (since commit [1]). Could someone please be so kind and help me
> > debugging / fixing this bug? I'm using space_cache=v2 on all those systems.
>
> Hm, so I think this indicates a bug in space accounting somewhere else
> rather than the free space tree itself. I haven't debugged one of these
> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
I should've asked, what sort of filesystem activity triggers this?
--
Omar
* Re: memory overflow or undeflow in free space tree / space_info?
2016-07-29 19:11 ` Omar Sandoval
2016-07-29 19:14 ` Omar Sandoval
@ 2016-07-29 19:39 ` Stefan Priebe - Profihost AG
1 sibling, 0 replies; 14+ messages in thread
From: Stefan Priebe - Profihost AG @ 2016-07-29 19:39 UTC
To: Omar Sandoval
Cc: linux-btrfs@vger.kernel.org, Josef Bacik, Holger Hoffstätte
On 29.07.2016 at 21:11, Omar Sandoval wrote:
> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>> Dear list,
>>
>> i'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>>
>> In all cases i'm getting a trace like this one a space_info warning.
>> (since commit [1]). Could someone please be so kind and help me
>> debugging / fixing this bug? I'm using space_cache=v2 on all those systems.
>
> Hm, so I think this indicates a bug in space accounting somewhere else
> rather than the free space tree itself. I haven't debugged one of these
> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
Thanks.
>> ------------[ cut here ]------------
>> WARNING: CPU: 5 PID: 26421 at fs/btrfs/extent-tree.c:5710
>
> Do these line numbers match up with yours?
>
> 5706 static void release_global_block_rsv(struct btrfs_fs_info *fs_info)
> 5707 {
> 5708         block_rsv_release_bytes(fs_info, &fs_info->global_block_rsv, NULL,
> 5709                                 (u64)-1);
> 5710         WARN_ON(fs_info->delalloc_block_rsv.size > 0);
> 5711         WARN_ON(fs_info->delalloc_block_rsv.reserved > 0);
> 5712         WARN_ON(fs_info->trans_block_rsv.size > 0);
> 5713         WARN_ON(fs_info->trans_block_rsv.reserved > 0);
> 5714         WARN_ON(fs_info->chunk_block_rsv.size > 0);
> 5715         WARN_ON(fs_info->chunk_block_rsv.reserved > 0);
> 5716         WARN_ON(fs_info->delayed_block_rsv.size > 0);
> 5717         WARN_ON(fs_info->delayed_block_rsv.reserved > 0);
> 5718 }
Yes, they do.
But the kernel I'm using is somewhat special: it is a 4.4 kernel with
a patchset from Holger (CC'ed). See here:
https://github.com/hhoffstaette/kernel-patches/tree/c9cce0933a40db84627241143b123210aee0fde6/4.4.15
* Re: memory overflow or undeflow in free space tree / space_info?
2016-07-29 19:14 ` Omar Sandoval
@ 2016-07-29 19:40 ` Stefan Priebe - Profihost AG
2016-07-29 21:03 ` Josef Bacik
1 sibling, 0 replies; 14+ messages in thread
From: Stefan Priebe - Profihost AG @ 2016-07-29 19:40 UTC
To: Omar Sandoval; +Cc: linux-btrfs@vger.kernel.org, Josef Bacik
On 29.07.2016 at 21:14, Omar Sandoval wrote:
> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>>> Dear list,
>>>
>>> i'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>>>
>>> In all cases i'm getting a trace like this one a space_info warning.
>>> (since commit [1]). Could someone please be so kind and help me
>>> debugging / fixing this bug? I'm using space_cache=v2 on all those systems.
>>
>> Hm, so I think this indicates a bug in space accounting somewhere else
>> rather than the free space tree itself. I haven't debugged one of these
>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>
> I should've asked, what sort of filesystem activity triggers this?
>
Sure.
The workload on the FS is basically:
- Write file1 (50GB - 500GB)
- cp --reflink=always file1 to file2
- apply changes to file2 (100MB - 5GB)
- cp --reflink=always file2 to file3
- apply changes to file3 (100MB - 5GB)
...
- delete file1
- cp --reflink=always file3 to file4
- apply changes to file4 (100MB - 5GB)
- delete file2
...
This happens for around 300 files a day. A btrfs balance with dusage=5 and
musage=5 runs daily, sometimes in parallel with the workload above.
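For anyone trying to reproduce this, the rotation above can be sketched as a dry-run plan. This is only an illustration: the function name `plan_workload` and the `keep` parameter are made up, `write` and `modify` are placeholders rather than real commands, and nothing is executed. Only the `cp --reflink=always` and delete steps are taken from the list above.

```python
def plan_workload(generations, keep=2):
    """Return the steps of the rolling reflink workload without running them.

    generations: number of files (file1 .. fileN) created in total.
    keep: how many generations stay on disk after each round (assumed to be
          2, which reproduces the delete order in the list above).
    """
    # "write" stands in for creating the 50GB - 500GB base file.
    steps = [["write", "file1"]]
    for n in range(2, generations + 1):
        # Clone the previous generation, then change 100MB - 5GB of the clone.
        steps.append(["cp", "--reflink=always", f"file{n - 1}", f"file{n}"])
        steps.append(["modify", f"file{n}"])
        # Drop the generation that has fallen out of the retention window.
        oldest = n - keep
        if oldest >= 1:
            steps.append(["rm", f"file{oldest}"])
    return steps


plan = plan_workload(4)
for step in plan:
    print(" ".join(step))
```

With four generations the plan deletes file1 right after file3 has been modified and file2 right after file4 has been modified, which is the order described above.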
Greets,
Stefan
* Re: memory overflow or undeflow in free space tree / space_info?
2016-07-29 19:14 ` Omar Sandoval
2016-07-29 19:40 ` Stefan Priebe - Profihost AG
@ 2016-07-29 21:03 ` Josef Bacik
2016-07-29 22:57 ` Holger Hoffstätte
2016-08-04 11:40 ` Stefan Priebe - Profihost AG
1 sibling, 2 replies; 14+ messages in thread
From: Josef Bacik @ 2016-07-29 21:03 UTC
To: Omar Sandoval, Stefan Priebe - Profihost AG; +Cc: linux-btrfs@vger.kernel.org
On 07/29/2016 03:14 PM, Omar Sandoval wrote:
> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>>> Dear list,
>>>
>>> i'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>>>
>>> In all cases i'm getting a trace like this one a space_info warning.
>>> (since commit [1]). Could someone please be so kind and help me
>>> debugging / fixing this bug? I'm using space_cache=v2 on all those systems.
>>
>> Hm, so I think this indicates a bug in space accounting somewhere else
>> rather than the free space tree itself. I haven't debugged one of these
>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>
> I should've asked, what sort of filesystem activity triggers this?
>
I think Chris just fixed this; try the next branch from his git tree
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
and see if it still happens. Thanks,
Josef
* Re: memory overflow or undeflow in free space tree / space_info?
2016-07-29 21:03 ` Josef Bacik
@ 2016-07-29 22:57 ` Holger Hoffstätte
2016-07-29 23:09 ` Holger Hoffstätte
2016-08-04 11:40 ` Stefan Priebe - Profihost AG
1 sibling, 1 reply; 14+ messages in thread
From: Holger Hoffstätte @ 2016-07-29 22:57 UTC
To: linux-btrfs
On Fri, 29 Jul 2016 17:03:43 -0400, Josef Bacik wrote:
> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost AG wrote:
>>>> Dear list,
>>>>
>>>> i'm seeing btrfs no space messages frequently on big filesystems (> 30TB).
>>>>
>>>> In all cases i'm getting a trace like this one a space_info warning.
>>>> (since commit [1]). Could someone please be so kind and help me
>>>> debugging / fixing this bug? I'm using space_cache=v2 on all those systems.
>>>
>>> Hm, so I think this indicates a bug in space accounting somewhere else
>>> rather than the free space tree itself. I haven't debugged one of these
>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>>
>> I should've asked, what sort of filesystem activity triggers this?
>>
>
> Chris just fixed this I think, try his next branch from his git tree
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
>
> and see if it still happens. Thanks,
>
> Josef
Hi Josef,
can you say which patch you have in mind? The tree in question
doesn't have any of Chandra's pagesize/sectorsize patches (carefully
patched around, for stability and LTS patchability) so I hope it's
not the recent commit
8b8b08cb "fix delalloc accounting after copy_from_user faults"
because that would be too fiddly (at least for me) to backport
correctly.
The only other patch I just found missing and which looks like it
could/should (I think?) work on top of the 4.4.x pagesize-based
calculations in file.c is:
a2af23b7 "__btrfs_buffered_write: Pass valid file offset when
releasing delalloc space"
Would that make sense? Neither I nor any other users of that tree
have observed weird space-info underflows so far (and I use my
fs daily), so it's definitely something peculiar Stefan is doing
with his weird compressed rsync-inplace workload. Odd sector offsets
causing slowly creeping space_info underflow sounds to me like it
just might be the problem.
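To illustrate what I mean by odd offsets, here is a toy model, not kernel code: it assumes space is reserved and released in whole 4096-byte sectors covering the byte range touched, and the names `SECTOR` and `aligned_len` are invented for the example. If the release is computed from a different offset than the reservation, the two amounts can differ and the difference stays accounted forever.

```python
SECTOR = 4096  # assumed sector size for this illustration


def aligned_len(offset, length, sector=SECTOR):
    """Bytes covered by the sectors that the range [offset, offset+length) touches."""
    start = offset // sector * sector               # round start down
    end = -(-(offset + length) // sector) * sector  # round end up
    return end - start


# A 200-byte write at offset 4000 straddles two sectors.
reserved = aligned_len(4000, 200)

# Releasing with the wrong offset (0 here) only covers one sector.
released = aligned_len(0, 200)

leaked = reserved - released
print(reserved, released, leaked)
```

In this model each such write leaves one sector permanently reserved, which is the kind of slow drift I have in mind.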
thanks,
Holger
* Re: memory overflow or undeflow in free space tree / space_info?
2016-07-29 22:57 ` Holger Hoffstätte
@ 2016-07-29 23:09 ` Holger Hoffstätte
0 siblings, 0 replies; 14+ messages in thread
From: Holger Hoffstätte @ 2016-07-29 23:09 UTC
To: linux-btrfs
On Fri, 29 Jul 2016 22:57:36 +0000, Holger Hoffstätte wrote:
> The only other patch I just found missing and which looks like it
> could/should (I think?) work on top of the 4.4.x pagesize-based
> calculations in file.c is:
>
> a2af23b7 "__btrfs_buffered_write: Pass valid file offset when
> releasing delalloc space"
>
> Would that make sense?
No it wouldn't, not without some other sectorsize-related patches
that came before...and those would just make matters worse.
So forget the above.
-h
* Re: memory overflow or undeflow in free space tree / space_info?
2016-07-29 21:03 ` Josef Bacik
2016-07-29 22:57 ` Holger Hoffstätte
@ 2016-08-04 11:40 ` Stefan Priebe - Profihost AG
2016-08-08 6:17 ` Stefan Priebe - Profihost AG
1 sibling, 1 reply; 14+ messages in thread
From: Stefan Priebe - Profihost AG @ 2016-08-04 11:40 UTC
To: Josef Bacik, Omar Sandoval; +Cc: linux-btrfs@vger.kernel.org
On 29.07.2016 at 23:03, Josef Bacik wrote:
> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost
>>> AG wrote:
>>>> Dear list,
>>>>
>>>> i'm seeing btrfs no space messages frequently on big filesystems (>
>>>> 30TB).
>>>>
>>>> In all cases i'm getting a trace like this one a space_info warning.
>>>> (since commit [1]). Could someone please be so kind and help me
>>>> debugging / fixing this bug? I'm using space_cache=v2 on all those
>>>> systems.
>>>
>>> Hm, so I think this indicates a bug in space accounting somewhere else
>>> rather than the free space tree itself. I haven't debugged one of these
>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>>
>> I should've asked, what sort of filesystem activity triggers this?
>>
>
> Chris just fixed this I think, try his next branch from his git tree
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
Thanks, I'm now running a 4.4 kernel with those patches backported. If that
still shows an error, I will try the vanilla tree.
Thanks!
Stefan
> and see if it still happens. Thanks,
>
> Josef
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: memory overflow or undeflow in free space tree / space_info?
2016-08-04 11:40 ` Stefan Priebe - Profihost AG
@ 2016-08-08 6:17 ` Stefan Priebe - Profihost AG
2016-08-10 21:31 ` Stefan Priebe - Profihost AG
0 siblings, 1 reply; 14+ messages in thread
From: Stefan Priebe - Profihost AG @ 2016-08-08 6:17 UTC (permalink / raw)
To: Josef Bacik, Omar Sandoval; +Cc: linux-btrfs@vger.kernel.org
On 04.08.2016 at 13:40, Stefan Priebe - Profihost AG wrote:
> On 29.07.2016 at 23:03, Josef Bacik wrote:
>> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost
>>>> AG wrote:
>>>>> Dear list,
>>>>>
>>>>> i'm seeing btrfs no space messages frequently on big filesystems (>
>>>>> 30TB).
>>>>>
>>>>> In all cases i'm getting a trace like this one a space_info warning.
>>>>> (since commit [1]). Could someone please be so kind and help me
>>>>> debugging / fixing this bug? I'm using space_cache=v2 on all those
>>>>> systems.
>>>>
>>>> Hm, so I think this indicates a bug in space accounting somewhere else
>>>> rather than the free space tree itself. I haven't debugged one of these
>>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>>>
>>> I should've asked, what sort of filesystem activity triggers this?
>>>
>>
>> Chris just fixed this I think, try his next branch from his git tree
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
>
> Thanks now running a 4.4 with those patches backported. If that still
> shows an error i will try that vanilla tree.
OK, this didn't work. I'll try the linux-btrfs next branch
and see if that helps.
Greets,
Stefan
>
> Thanks!
>
> Stefan
>
>> and see if it still happens. Thanks,
>>
>> Josef
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: memory overflow or undeflow in free space tree / space_info?
2016-08-08 6:17 ` Stefan Priebe - Profihost AG
@ 2016-08-10 21:31 ` Stefan Priebe - Profihost AG
2016-08-11 6:09 ` Stefan Priebe - Profihost AG
0 siblings, 1 reply; 14+ messages in thread
From: Stefan Priebe - Profihost AG @ 2016-08-10 21:31 UTC (permalink / raw)
To: Josef Bacik, Omar Sandoval; +Cc: linux-btrfs@vger.kernel.org
Hi Josef,
same again with Chris's next branch:
ERROR: error during balancing '/vmbackup/': No space left on device
There may be more info in syslog - try dmesg | tail
Dumping filters: flags 0x7, state 0x0, force is off
DATA (flags 0x2): balancing, usage=5
METADATA (flags 0x2): balancing, usage=5
SYSTEM (flags 0x2): balancing, usage=5
dmesg:
[203784.411189] BTRFS info (device dm-0): 114 enospc errors during balance
uname -r 4.7.0-rc6-29043-g8b8b08c
Greets,
Stefan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: memory overflow or undeflow in free space tree / space_info?
2016-08-10 21:31 ` Stefan Priebe - Profihost AG
@ 2016-08-11 6:09 ` Stefan Priebe - Profihost AG
2016-08-14 15:22 ` Stefan Priebe - Profihost AG
0 siblings, 1 reply; 14+ messages in thread
From: Stefan Priebe - Profihost AG @ 2016-08-11 6:09 UTC (permalink / raw)
To: Josef Bacik, Omar Sandoval; +Cc: linux-btrfs@vger.kernel.org
Hello,
the backtrace and info on umount look the same:
[241910.341124] ------------[ cut here ]------------
[241910.379991] WARNING: CPU: 1 PID: 26664 at
fs/btrfs/extent-tree.c:5701 btrfs_free_block_groups+0x370/0x410 [btrfs]
[241910.422099] Modules linked in: netconsole mpt3sas ipt_REJECT
raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp
iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci
i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si
ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod
ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas
[241910.616845] CPU: 1 PID: 26664 Comm: umount Not tainted
4.7.0-rc6-29043-g8b8b08c #1
[241910.669646] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
02/18/2015
[241910.723716] 0000000000000000 ffff8808d104bca8 ffffffffbd3d83cf
0000000000000000
[241910.779309] 0000000000000000 ffff8808d104bcf8 ffffffffbd085615
ffff8808d104bd08
[241910.835143] 000016455a3410a8 00000047a0000000 0000000000000000
ffff8808469e2088
[241910.891882] Call Trace:
[241910.947624] [<ffffffffbd3d83cf>] dump_stack+0x63/0x84
[241911.003714] [<ffffffffbd085615>] __warn+0xe5/0x100
[241911.060167] [<ffffffffbd08564d>] warn_slowpath_null+0x1d/0x20
[241911.117422] [<ffffffffc058ca90>]
btrfs_free_block_groups+0x370/0x410 [btrfs]
[241911.175975] [<ffffffffc059e7ab>] close_ctree+0x15b/0x330 [btrfs]
[241911.235170] [<ffffffffc056f089>] btrfs_put_super+0x19/0x20 [btrfs]
[241911.294638] [<ffffffffbd1deaff>] generic_shutdown_super+0x6f/0x100
[241911.353005] [<ffffffffbd1df026>] kill_anon_super+0x16/0x30
[241911.409832] [<ffffffffc05720fa>] btrfs_kill_super+0x1a/0xb0 [btrfs]
[241911.466467] [<ffffffffbd1df1f1>] deactivate_locked_super+0x51/0x90
[241911.522602] [<ffffffffbd1dfb8e>] deactivate_super+0x4e/0x70
[241911.577979] [<ffffffffbd1fba73>] cleanup_mnt+0x43/0x90
[241911.633188] [<ffffffffbd1fbb12>] __cleanup_mnt+0x12/0x20
[241911.688146] [<ffffffffbd0a1f61>] task_work_run+0x81/0xb0
[241911.742740] [<ffffffffbd07ffcd>] exit_to_usermode_loop+0x66/0x95
[241911.797039] [<ffffffffbd002a7d>] do_syscall_64+0x10d/0x150
[241911.850750] [<ffffffffbd6d9ca1>] entry_SYSCALL64_slow_path+0x25/0x25
[241911.903564] ---[ end trace fae017546778f2b0 ]---
[241911.955332] ------------[ cut here ]------------
[241912.006262] WARNING: CPU: 1 PID: 26664 at
fs/btrfs/extent-tree.c:5702 btrfs_free_block_groups+0x40a/0x410 [btrfs]
[241912.059326] Modules linked in: netconsole mpt3sas ipt_REJECT
raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp
iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci
i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si
ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod
ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas
[241912.298666] CPU: 1 PID: 26664 Comm: umount Tainted: G W
4.7.0-rc6-29043-g8b8b08c #1
[241912.363401] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
02/18/2015
[241912.429395] 0000000000000000 ffff8808d104bca8 ffffffffbd3d83cf
0000000000000000
[241912.497080] 0000000000000000 ffff8808d104bcf8 ffffffffbd085615
ffff8808d104bd08
[241912.565113] 000016465a3410a8 00000047a0000000 0000000000000000
ffff8808469e2088
[241912.634105] Call Trace:
[241912.702992] [<ffffffffbd3d83cf>] dump_stack+0x63/0x84
[241912.773473] [<ffffffffbd085615>] __warn+0xe5/0x100
[241912.844339] [<ffffffffbd08564d>] warn_slowpath_null+0x1d/0x20
[241912.916083] [<ffffffffc058cb2a>]
btrfs_free_block_groups+0x40a/0x410 [btrfs]
[241912.989103] [<ffffffffc059e7ab>] close_ctree+0x15b/0x330 [btrfs]
[241913.062672] [<ffffffffc056f089>] btrfs_put_super+0x19/0x20 [btrfs]
[241913.136364] [<ffffffffbd1deaff>] generic_shutdown_super+0x6f/0x100
[241913.208701] [<ffffffffbd1df026>] kill_anon_super+0x16/0x30
[241913.279194] [<ffffffffc05720fa>] btrfs_kill_super+0x1a/0xb0 [btrfs]
[241913.348065] [<ffffffffbd1df1f1>] deactivate_locked_super+0x51/0x90
[241913.415082] [<ffffffffbd1dfb8e>] deactivate_super+0x4e/0x70
[241913.479841] [<ffffffffbd1fba73>] cleanup_mnt+0x43/0x90
[241913.543353] [<ffffffffbd1fbb12>] __cleanup_mnt+0x12/0x20
[241913.605959] [<ffffffffbd0a1f61>] task_work_run+0x81/0xb0
[241913.667542] [<ffffffffbd07ffcd>] exit_to_usermode_loop+0x66/0x95
[241913.729612] [<ffffffffbd002a7d>] do_syscall_64+0x10d/0x150
[241913.791203] [<ffffffffbd6d9ca1>] entry_SYSCALL64_slow_path+0x25/0x25
[241913.852485] ---[ end trace fae017546778f2b1 ]---
[241913.913638] ------------[ cut here ]------------
[241913.974871] WARNING: CPU: 1 PID: 26664 at
fs/btrfs/extent-tree.c:10013 btrfs_free_block_groups+0x2ba/0x410 [btrfs]
[241914.039315] Modules linked in: netconsole mpt3sas ipt_REJECT
raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp
iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci
i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si
ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod
ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas
[241914.315918] CPU: 1 PID: 26664 Comm: umount Tainted: G W
4.7.0-rc6-29043-g8b8b08c #1
[241914.388096] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
02/18/2015
[241914.460679] 0000000000000000 ffff8808d104bca8 ffffffffbd3d83cf
0000000000000000
[241914.534126] 0000000000000000 ffff8808d104bcf8 ffffffffbd085615
ffff8808d104bce8
[241914.607523] 0000271dbd3dac8c ffff88085184aac8 0000000000000038
0000000000000000
[241914.681318] Call Trace:
[241914.754437] [<ffffffffbd3d83cf>] dump_stack+0x63/0x84
[241914.828796] [<ffffffffbd085615>] __warn+0xe5/0x100
[241914.902953] [<ffffffffbd08564d>] warn_slowpath_null+0x1d/0x20
[241914.977271] [<ffffffffc058c9da>]
btrfs_free_block_groups+0x2ba/0x410 [btrfs]
[241915.052041] [<ffffffffc059e7ab>] close_ctree+0x15b/0x330 [btrfs]
[241915.126282] [<ffffffffc056f089>] btrfs_put_super+0x19/0x20 [btrfs]
[241915.200758] [<ffffffffbd1deaff>] generic_shutdown_super+0x6f/0x100
[241915.273872] [<ffffffffbd1df026>] kill_anon_super+0x16/0x30
[241915.345132] [<ffffffffc05720fa>] btrfs_kill_super+0x1a/0xb0 [btrfs]
[241915.414703] [<ffffffffbd1df1f1>] deactivate_locked_super+0x51/0x90
[241915.482488] [<ffffffffbd1dfb8e>] deactivate_super+0x4e/0x70
[241915.547994] [<ffffffffbd1fba73>] cleanup_mnt+0x43/0x90
[241915.611962] [<ffffffffbd1fbb12>] __cleanup_mnt+0x12/0x20
[241915.674717] [<ffffffffbd0a1f61>] task_work_run+0x81/0xb0
[241915.736398] [<ffffffffbd07ffcd>] exit_to_usermode_loop+0x66/0x95
[241915.798592] [<ffffffffbd002a7d>] do_syscall_64+0x10d/0x150
[241915.860295] [<ffffffffbd6d9ca1>] entry_SYSCALL64_slow_path+0x25/0x25
[241915.921642] ---[ end trace fae017546778f2b2 ]---
[241915.982893] BTRFS: space_info 4 has 114577997824 free, is not full
[241916.045103] BTRFS: space_info total=307627032576, used=193048903680, pinned=0, reserved=0, may_use=688537059328, readonly=131072
Greets,
Stefan
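
For readers following the numbers in the space_info dump above, here is a
small sketch (Python, not part of the original mail) that re-derives the
"free" value from the printed counters. It assumes the kernel computes free
as total - used - pinned - reserved - readonly and leaves may_use out of the
calculation; that formula is an assumption inferred from the figures
matching, and the helper names are made up for illustration.

```python
import re

# The two dump lines from the umount log above, joined into one string.
LOG = (
    "BTRFS: space_info 4 has 114577997824 free, is not full\n"
    "BTRFS: space_info total=307627032576, used=193048903680, "
    "pinned=0, reserved=0, may_use=688537059328, readonly=131072"
)


def parse_space_info(text):
    """Return (reported_free, counters) parsed from a space_info dump."""
    reported = int(re.search(r"has (\d+) free", text).group(1))
    # Collect every name=value pair, e.g. total=..., may_use=...
    counters = {key: int(val) for key, val in re.findall(r"(\w+)=(\d+)", text)}
    return reported, counters


def recompute_free(c):
    """Assumed formula: may_use is not subtracted from the free figure."""
    return c["total"] - c["used"] - c["pinned"] - c["reserved"] - c["readonly"]


reported_free, counters = parse_space_info(LOG)
computed_free = recompute_free(counters)

print("reported free :", reported_free)
print("computed free :", computed_free)
print("may_use/total : %.2f" % (counters["may_use"] / counters["total"]))
```

With the values from the log, the recomputed free space equals the reported
one, and may_use alone is more than twice total. As far as I can tell,
space_info 4 is the metadata space_info (flag value 4 is
BTRFS_BLOCK_GROUP_METADATA), so the leftover reservation would be on the
metadata side.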
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: memory overflow or undeflow in free space tree / space_info?
2016-08-11 6:09 ` Stefan Priebe - Profihost AG
@ 2016-08-14 15:22 ` Stefan Priebe - Profihost AG
2016-08-29 14:02 ` Stefan Priebe - Profihost AG
0 siblings, 1 reply; 14+ messages in thread
From: Stefan Priebe - Profihost AG @ 2016-08-14 15:22 UTC (permalink / raw)
To: Josef Bacik, Omar Sandoval; +Cc: linux-btrfs@vger.kernel.org
Hi Josef,
Is there anything I could do or test? Results with a vanilla next branch are the
same.
Stefan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: memory overflow or undeflow in free space tree / space_info?
2016-08-14 15:22 ` Stefan Priebe - Profihost AG
@ 2016-08-29 14:02 ` Stefan Priebe - Profihost AG
0 siblings, 0 replies; 14+ messages in thread
From: Stefan Priebe - Profihost AG @ 2016-08-29 14:02 UTC (permalink / raw)
To: Josef Bacik, Omar Sandoval; +Cc: linux-btrfs@vger.kernel.org
Hi Josef,
this still happens with current 4.8-rc* releases. Is there anything I can do to
debug this? Maybe insert some code to check for an underflow or overflow in
the code?
Stefan
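
To illustrate the kind of check asked about above, here is a minimal
user-space sketch (Python, not kernel code and not from the original mail).
It models a 64-bit byte counter such as may_use and refuses any update that
would wrap. The class and function names are hypothetical; a real check would
have to live in the kernel's accounting paths.

```python
U64_MAX = 2**64 - 1


def wrapping_sub(a, b):
    """What an unchecked unsigned 64-bit subtraction would produce."""
    return (a - b) & U64_MAX


class ReservationCounter:
    """Hypothetical stand-in for a u64 counter like may_use."""

    def __init__(self, name, value=0):
        self.name = name
        self.value = value

    def add(self, nbytes):
        # Reject additions that would exceed the 64-bit range.
        if self.value + nbytes > U64_MAX:
            raise ArithmeticError(
                "%s overflow: %d + %d" % (self.name, self.value, nbytes))
        self.value += nbytes

    def sub(self, nbytes):
        # Reject releases larger than what is currently reserved.
        if nbytes > self.value:
            raise ArithmeticError(
                "%s underflow: %d - %d" % (self.name, self.value, nbytes))
        self.value -= nbytes


counter = ReservationCounter("may_use")
counter.add(8192)   # reserve two 4 KiB units
counter.sub(4096)   # release one of them

try:
    counter.sub(8192)  # release more than is reserved
except ArithmeticError as err:
    print("caught:", err)

# Without the check, the same subtraction would wrap to a value near 2**64.
print("unchecked result:", wrapping_sub(4096, 8192))
```

One observation, offered with caution: a wrapped counter would sit close to
2**64, while the may_use value in the dump is far below that, so the log
looks more like a reservation that was never released than a plain
wraparound. That is a guess from the numbers, not a diagnosis.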
On 14.08.2016 at 17:22, Stefan Priebe - Profihost AG wrote:
> Hi Josef,
>
> anything i could do or test? Results with a vanilla next branch are the
> same.
>
> Stefan
>
> On 11.08.2016 at 08:09, Stefan Priebe - Profihost AG wrote:
>> Hello,
>>
>> the backtrace and info on umount looks the same:
>>
>> [241910.341124] ------------[ cut here ]------------
>> [241910.379991] WARNING: CPU: 1 PID: 26664 at
>> fs/btrfs/extent-tree.c:5701 btrfs_free_block_groups+0x370/0x410 [btrfs]
>> [241910.422099] Modules linked in: netconsole mpt3sas ipt_REJECT
>> raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp
>> iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci
>> i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si
>> ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod
>> ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas
>> [241910.616845] CPU: 1 PID: 26664 Comm: umount Not tainted
>> 4.7.0-rc6-29043-g8b8b08c #1
>> [241910.669646] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
>> 02/18/2015
>> [241910.723716] 0000000000000000 ffff8808d104bca8 ffffffffbd3d83cf
>> 0000000000000000
>> [241910.779309] 0000000000000000 ffff8808d104bcf8 ffffffffbd085615
>> ffff8808d104bd08
>> [241910.835143] 000016455a3410a8 00000047a0000000 0000000000000000
>> ffff8808469e2088
>> [241910.891882] Call Trace:
>> [241910.947624] [<ffffffffbd3d83cf>] dump_stack+0x63/0x84
>> [241911.003714] [<ffffffffbd085615>] __warn+0xe5/0x100
>> [241911.060167] [<ffffffffbd08564d>] warn_slowpath_null+0x1d/0x20
>> [241911.117422] [<ffffffffc058ca90>]
>> btrfs_free_block_groups+0x370/0x410 [btrfs]
>> [241911.175975] [<ffffffffc059e7ab>] close_ctree+0x15b/0x330 [btrfs]
>> [241911.235170] [<ffffffffc056f089>] btrfs_put_super+0x19/0x20 [btrfs]
>> [241911.294638] [<ffffffffbd1deaff>] generic_shutdown_super+0x6f/0x100
>> [241911.353005] [<ffffffffbd1df026>] kill_anon_super+0x16/0x30
>> [241911.409832] [<ffffffffc05720fa>] btrfs_kill_super+0x1a/0xb0 [btrfs]
>> [241911.466467] [<ffffffffbd1df1f1>] deactivate_locked_super+0x51/0x90
>> [241911.522602] [<ffffffffbd1dfb8e>] deactivate_super+0x4e/0x70
>> [241911.577979] [<ffffffffbd1fba73>] cleanup_mnt+0x43/0x90
>> [241911.633188] [<ffffffffbd1fbb12>] __cleanup_mnt+0x12/0x20
>> [241911.688146] [<ffffffffbd0a1f61>] task_work_run+0x81/0xb0
>> [241911.742740] [<ffffffffbd07ffcd>] exit_to_usermode_loop+0x66/0x95
>> [241911.797039] [<ffffffffbd002a7d>] do_syscall_64+0x10d/0x150
>> [241911.850750] [<ffffffffbd6d9ca1>] entry_SYSCALL64_slow_path+0x25/0x25
>> [241911.903564] ---[ end trace fae017546778f2b0 ]---
>> [241911.955332] ------------[ cut here ]------------
>> [241912.006262] WARNING: CPU: 1 PID: 26664 at
>> fs/btrfs/extent-tree.c:5702 btrfs_free_block_groups+0x40a/0x410 [btrfs]
>> [241912.059326] Modules linked in: netconsole mpt3sas ipt_REJECT
>> raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp
>> iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci
>> i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si
>> ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod
>> ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas
>> [241912.298666] CPU: 1 PID: 26664 Comm: umount Tainted: G W
>> 4.7.0-rc6-29043-g8b8b08c #1
>> [241912.363401] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
>> 02/18/2015
>> [241912.429395] 0000000000000000 ffff8808d104bca8 ffffffffbd3d83cf
>> 0000000000000000
>> [241912.497080] 0000000000000000 ffff8808d104bcf8 ffffffffbd085615
>> ffff8808d104bd08
>> [241912.565113] 000016465a3410a8 00000047a0000000 0000000000000000
>> ffff8808469e2088
>> [241912.634105] Call Trace:
>> [241912.702992] [<ffffffffbd3d83cf>] dump_stack+0x63/0x84
>> [241912.773473] [<ffffffffbd085615>] __warn+0xe5/0x100
>> [241912.844339] [<ffffffffbd08564d>] warn_slowpath_null+0x1d/0x20
>> [241912.916083] [<ffffffffc058cb2a>]
>> btrfs_free_block_groups+0x40a/0x410 [btrfs]
>> [241912.989103] [<ffffffffc059e7ab>] close_ctree+0x15b/0x330 [btrfs]
>> [241913.062672] [<ffffffffc056f089>] btrfs_put_super+0x19/0x20 [btrfs]
>> [241913.136364] [<ffffffffbd1deaff>] generic_shutdown_super+0x6f/0x100
>> [241913.208701] [<ffffffffbd1df026>] kill_anon_super+0x16/0x30
>> [241913.279194] [<ffffffffc05720fa>] btrfs_kill_super+0x1a/0xb0 [btrfs]
>> [241913.348065] [<ffffffffbd1df1f1>] deactivate_locked_super+0x51/0x90
>> [241913.415082] [<ffffffffbd1dfb8e>] deactivate_super+0x4e/0x70
>> [241913.479841] [<ffffffffbd1fba73>] cleanup_mnt+0x43/0x90
>> [241913.543353] [<ffffffffbd1fbb12>] __cleanup_mnt+0x12/0x20
>> [241913.605959] [<ffffffffbd0a1f61>] task_work_run+0x81/0xb0
>> [241913.667542] [<ffffffffbd07ffcd>] exit_to_usermode_loop+0x66/0x95
>> [241913.729612] [<ffffffffbd002a7d>] do_syscall_64+0x10d/0x150
>> [241913.791203] [<ffffffffbd6d9ca1>] entry_SYSCALL64_slow_path+0x25/0x25
>> [241913.852485] ---[ end trace fae017546778f2b1 ]---
>> [241913.913638] ------------[ cut here ]------------
>> [241913.974871] WARNING: CPU: 1 PID: 26664 at
>> fs/btrfs/extent-tree.c:10013 btrfs_free_block_groups+0x2ba/0x410 [btrfs]
>> [241914.039315] Modules linked in: netconsole mpt3sas ipt_REJECT
>> raid_class nf_reject_ipv4 scsi_transport_sas xt_multiport 8021q garp
>> iptable_filter ip_tables x_tables bonding coretemp loop usbhid ehci_pci
>> i2c_i801 ehci_hcd usbcore i2c_core shpchp usb_common ipmi_si
>> ipmi_msghandler button btrfs dm_mod raid1 raid456 async_raid6_recov
>> async_memcpy async_pq async_xor async_tx xor raid6_pq md_mod sg sd_mod
>> ixgbe i40e mdio ptp ahci libahci pps_core megaraid_sas
>> [241914.315918] CPU: 1 PID: 26664 Comm: umount Tainted: G W
>> 4.7.0-rc6-29043-g8b8b08c #1
>> [241914.388096] Hardware name: Supermicro X10DRH/X10DRH-iT, BIOS 1.0c
>> 02/18/2015
>> [241914.460679] 0000000000000000 ffff8808d104bca8 ffffffffbd3d83cf
>> 0000000000000000
>> [241914.534126] 0000000000000000 ffff8808d104bcf8 ffffffffbd085615
>> ffff8808d104bce8
>> [241914.607523] 0000271dbd3dac8c ffff88085184aac8 0000000000000038
>> 0000000000000000
>> [241914.681318] Call Trace:
>> [241914.754437] [<ffffffffbd3d83cf>] dump_stack+0x63/0x84
>> [241914.828796] [<ffffffffbd085615>] __warn+0xe5/0x100
>> [241914.902953] [<ffffffffbd08564d>] warn_slowpath_null+0x1d/0x20
>> [241914.977271] [<ffffffffc058c9da>]
>> btrfs_free_block_groups+0x2ba/0x410 [btrfs]
>> [241915.052041] [<ffffffffc059e7ab>] close_ctree+0x15b/0x330 [btrfs]
>> [241915.126282] [<ffffffffc056f089>] btrfs_put_super+0x19/0x20 [btrfs]
>> [241915.200758] [<ffffffffbd1deaff>] generic_shutdown_super+0x6f/0x100
>> [241915.273872] [<ffffffffbd1df026>] kill_anon_super+0x16/0x30
>> [241915.345132] [<ffffffffc05720fa>] btrfs_kill_super+0x1a/0xb0 [btrfs]
>> [241915.414703] [<ffffffffbd1df1f1>] deactivate_locked_super+0x51/0x90
>> [241915.482488] [<ffffffffbd1dfb8e>] deactivate_super+0x4e/0x70
>> [241915.547994] [<ffffffffbd1fba73>] cleanup_mnt+0x43/0x90
>> [241915.611962] [<ffffffffbd1fbb12>] __cleanup_mnt+0x12/0x20
>> [241915.674717] [<ffffffffbd0a1f61>] task_work_run+0x81/0xb0
>> [241915.736398] [<ffffffffbd07ffcd>] exit_to_usermode_loop+0x66/0x95
>> [241915.798592] [<ffffffffbd002a7d>] do_syscall_64+0x10d/0x150
>> [241915.860295] [<ffffffffbd6d9ca1>] entry_SYSCALL64_slow_path+0x25/0x25
>> [241915.921642] ---[ end trace fae017546778f2b2 ]---
>> [241915.982893] BTRFS: space_info 4 has 114577997824 free, is not full
>> [241916.045103] BTRFS: space_info total=307627032576, used=193048903680, pinned=0, reserved=0, may_use=688537059328, readonly=131072
>>
>> Greets,
>> Stefan
>>
>> Am 10.08.2016 um 23:31 schrieb Stefan Priebe - Profihost AG:
>>> Hi Josef,
>>>
>>> Same again with Chris's next branch:
>>>
>>> ERROR: error during balancing '/vmbackup/': No space left on device
>>> There may be more info in syslog - try dmesg | tail
>>> Dumping filters: flags 0x7, state 0x0, force is off
>>> DATA (flags 0x2): balancing, usage=5
>>> METADATA (flags 0x2): balancing, usage=5
>>> SYSTEM (flags 0x2): balancing, usage=5
>>>
>>> dmesg:
>>> [203784.411189] BTRFS info (device dm-0): 114 enospc errors during balance
>>>
>>> uname -r 4.7.0-rc6-29043-g8b8b08c
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 08.08.2016 um 08:17 schrieb Stefan Priebe - Profihost AG:
>>>> Am 04.08.2016 um 13:40 schrieb Stefan Priebe - Profihost AG:
>>>>> Am 29.07.2016 um 23:03 schrieb Josef Bacik:
>>>>>> On 07/29/2016 03:14 PM, Omar Sandoval wrote:
>>>>>>> On Fri, Jul 29, 2016 at 12:11:53PM -0700, Omar Sandoval wrote:
>>>>>>>> On Fri, Jul 29, 2016 at 08:40:26PM +0200, Stefan Priebe - Profihost
>>>>>>>> AG wrote:
>>>>>>>>> Dear list,
>>>>>>>>>
>>>>>>>>> I'm seeing btrfs "no space" messages frequently on big filesystems
>>>>>>>>> (> 30TB).
>>>>>>>>>
>>>>>>>>> In all cases I'm getting a trace like this one and a space_info
>>>>>>>>> warning (since commit [1]). Could someone please be so kind and help
>>>>>>>>> me debug / fix this bug? I'm using space_cache=v2 on all those
>>>>>>>>> systems.
>>>>>>>>
>>>>>>>> Hm, so I think this indicates a bug in space accounting somewhere else
>>>>>>>> rather than the free space tree itself. I haven't debugged one of these
>>>>>>>> issues before, I'll see if I can reproduce it. Cc'ing Josef, too.
>>>>>>>
>>>>>>> I should've asked, what sort of filesystem activity triggers this?
>>>>>>>
>>>>>>
>>>>>> Chris just fixed this I think, try his next branch from his git tree
>>>>>>
>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
>>>>>
>>>>> Thanks, I'm now running a 4.4 kernel with those patches backported. If
>>>>> that still shows an error, I will try that vanilla tree.
>>>>
>>>> OK, this didn't work. I'll try using the linux-btrfs next branch
>>>> and see if this helps.
>>>>
>>>> Greets,
>>>> Stefan
>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Stefan
>>>>>
>>>>>> and see if it still happens. Thanks,
>>>>>>
>>>>>> Josef
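
The space_info dump in the dmesg output quoted above can be checked by hand. The following is a minimal sketch in Python, assuming that the reported "free" value is total minus used, pinned, reserved and readonly. That formula is inferred from the numbers in the log (which it reproduces exactly), not taken from the kernel source, and the helper names are made up for illustration.

```python
import re

# The two space_info lines from the quoted dmesg output, rejoined.
LOG = (
    "BTRFS: space_info 4 has 114577997824 free, is not full\n"
    "BTRFS: space_info total=307627032576, used=193048903680, "
    "pinned=0, reserved=0, may_use=688537059328, readonly=131072\n"
)


def parse_space_info(text):
    """Return (reported_free, counters) parsed from a space_info dump."""
    reported_free = int(re.search(r"has (\d+) free", text).group(1))
    counters = {key: int(val) for key, val in re.findall(r"(\w+)=(\d+)", text)}
    return reported_free, counters


def computed_free(c):
    """Assumed formula (inferred from the log, not from kernel source)."""
    return c["total"] - c["used"] - c["pinned"] - c["reserved"] - c["readonly"]


reported, counters = parse_space_info(LOG)
free = computed_free(counters)

print("reported free:", reported)
print("computed free:", free)
# may_use counts bytes that are reserved but not yet allocated; here it is
# larger than the whole space_info, not just larger than the free space.
print("may_use exceeds total:", counters["may_use"] > counters["total"])
print("may_use minus free:", counters["may_use"] - free)
```

With these numbers the computed free space equals the reported 114577997824 bytes, while may_use (688537059328) is more than twice the total (307627032576). That would be consistent with Omar's suggestion that the problem is in space accounting rather than in the free space tree itself.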
Thread overview: 14+ messages
2016-07-29 18:40 memory overflow or undeflow in free space tree / space_info? Stefan Priebe - Profihost AG
2016-07-29 19:11 ` Omar Sandoval
2016-07-29 19:14 ` Omar Sandoval
2016-07-29 19:40 ` Stefan Priebe - Profihost AG
2016-07-29 21:03 ` Josef Bacik
2016-07-29 22:57 ` Holger Hoffstätte
2016-07-29 23:09 ` Holger Hoffstätte
2016-08-04 11:40 ` Stefan Priebe - Profihost AG
2016-08-08 6:17 ` Stefan Priebe - Profihost AG
2016-08-10 21:31 ` Stefan Priebe - Profihost AG
2016-08-11 6:09 ` Stefan Priebe - Profihost AG
2016-08-14 15:22 ` Stefan Priebe - Profihost AG
2016-08-29 14:02 ` Stefan Priebe - Profihost AG
2016-07-29 19:39 ` Stefan Priebe - Profihost AG