* 4.14 balance: kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!
From: Tomasz Chmielewski @ 2017-11-18 0:49 UTC
To: Btrfs BTRFS
I'm getting the following BUG when running balance on one of my systems:
[ 3458.698704] BTRFS info (device sdb3): relocating block group 306045779968 flags data|raid1
[ 3466.892933] BTRFS info (device sdb3): found 2405 extents
[ 3495.408630] BTRFS info (device sdb3): found 2405 extents
[ 3498.161144] ------------[ cut here ]------------
[ 3498.161150] kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!
[ 3498.161264] invalid opcode: 0000 [#1] SMP
[ 3498.161363] Modules linked in: nf_log_ipv6 nf_log_ipv4 nf_log_common xt_LOG xt_multiport xt_conntrack xt_nat binfmt_misc veth ip6table_filter xt_CHECKSUM iptable_mangle xt_tcpudp ip6t_MASQUERADE nf_nat_masquerade_ipv6 ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_comment iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables bridge stp llc intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate hci_uart intel_rapl_perf btbcm input_leds serdev serio_raw btqca btintel bluetooth intel_pch_thermal intel_lpss_acpi intel_lpss mac_hid acpi_pad
[ 3498.162060] ecdh_generic acpi_als kfifo_buf industrialio autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear raid1 e1000e psmouse ptp ahci pps_core libahci wmi pinctrl_sunrisepoint i2c_hid video pinctrl_intel hid
[ 3498.162386] CPU: 7 PID: 29041 Comm: btrfs Not tainted 4.14.0-041400-generic #201711122031
[ 3498.162545] Hardware name: FUJITSU /D3401-H2, BIOS V5.0.0.12 R1.5.0 for D3401-H2x 02/27/2017
[ 3498.162723] task: ffff8d7858e82f00 task.stack: ffffb4ee47d5c000
[ 3498.162890] RIP: 0010:read_node_slot+0xd7/0xe0 [btrfs]
[ 3498.163027] RSP: 0018:ffffb4ee47d5fb88 EFLAGS: 00010246
[ 3498.163156] RAX: ffff8d78c8bb7000 RBX: ffff8d8124abd380 RCX: 0000000000000001
[ 3498.163290] RDX: 0000000000000048 RSI: ffff8d7ae1fef6f8 RDI: ffff8d8124aa0000
[ 3498.163422] RBP: ffffb4ee47d5fba8 R08: 0000000000000001 R09: ffff8d8124abd384
[ 3498.163555] R10: 0000000000000001 R11: 0000000000114000 R12: 0000000000000002
[ 3498.163689] R13: ffffb4ee47d5fc66 R14: ffffb4ee47d5fc50 R15: 0000000000000000
[ 3498.163825] FS: 00007fa4c9a998c0(0000) GS:ffff8d816e5c0000(0000) knlGS:0000000000000000
[ 3498.163990] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3498.164120] CR2: 000056410155a028 CR3: 00000009c194c002 CR4: 00000000003606e0
[ 3498.164255] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3498.164390] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3498.164523] Call Trace:
[ 3498.164694] tree_advance+0x16e/0x1d0 [btrfs]
[ 3498.164874] btrfs_compare_trees+0x2da/0x6a0 [btrfs]
[ 3498.165078] ? process_extent+0x1580/0x1580 [btrfs]
[ 3498.165264] btrfs_ioctl_send+0xe94/0x1120 [btrfs]
[ 3498.165450] btrfs_ioctl+0x93c/0x1f00 [btrfs]
[ 3498.165587] ? enqueue_task_fair+0xa8/0x6c0
[ 3498.165724] do_vfs_ioctl+0xa5/0x600
[ 3498.165854] ? do_vfs_ioctl+0xa5/0x600
[ 3498.165979] ? _do_fork+0x144/0x3a0
[ 3498.166103] SyS_ioctl+0x79/0x90
[ 3498.166234] entry_SYSCALL_64_fastpath+0x1e/0xa9
[ 3498.166368] RIP: 0033:0x7fa4c8b17f07
[ 3498.166488] RSP: 002b:00007ffd33644e38 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
[ 3498.166653] RAX: ffffffffffffffda RBX: 00007fa4c8a1a700 RCX: 00007fa4c8b17f07
[ 3498.166787] RDX: 00007ffd33644f30 RSI: 0000000040489426 RDI: 0000000000000004
[ 3498.166921] RBP: 00007ffd33644dc0 R08: 0000000000000000 R09: 00007fa4c8a1a700
[ 3498.167055] R10: 00007fa4c8a1a9d0 R11: 0000000000000202 R12: 0000000000000000
[ 3498.167190] R13: 00007ffd33644dbf R14: 00007fa4c8a1a9c0 R15: 000000000129f020
[ 3498.167326] Code: 48 c7 c3 fb ff ff ff e8 f8 5c 05 00 48 89 d8 5b 41 5c 41 5d 41 5e 5d c3 48 c7 c3 fe ff ff ff 48 89 d8 5b 41 5c 41 5d 41 5e 5d c3 <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41
[ 3498.167690] RIP: read_node_slot+0xd7/0xe0 [btrfs] RSP: ffffb4ee47d5fb88
[ 3498.167892] ---[ end trace 6a751a3020dd3086 ]---
[ 3499.572729] BTRFS info (device sdb3): relocating block group 304972038144 flags data|raid1
[ 3504.068432] BTRFS info (device sdb3): found 2037 extents
[ 3538.281808] BTRFS info (device sdb3): found 2037 extents
Tomasz Chmielewski
https://lxadm.com
* Re: 4.14 balance: kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!
From: Hans van Kranenburg @ 2017-11-18 1:08 UTC
To: Tomasz Chmielewski, Btrfs BTRFS
On 11/18/2017 01:49 AM, Tomasz Chmielewski wrote:
> I'm getting the following BUG when running balance on one of my systems:
>
>
> [ 3458.698704] BTRFS info (device sdb3): relocating block group
> 306045779968 flags data|raid1
> [ 3466.892933] BTRFS info (device sdb3): found 2405 extents
> [ 3495.408630] BTRFS info (device sdb3): found 2405 extents
> [ 3498.161144] ------------[ cut here ]------------
> [ 3498.161150] kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!
> [ 3498.161264] invalid opcode: 0000 [#1] SMP
> [ 3498.161363] Modules linked in: nf_log_ipv6 nf_log_ipv4 nf_log_common
> xt_LOG xt_multiport xt_conntrack xt_nat binfmt_misc veth ip6table_filter
> xt_CHECKSUM iptable_mangle xt_tcpudp ip6t_MASQUERADE
> nf_nat_masquerade_ipv6 ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6
> nf_nat_ipv6 ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_comment
> iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
> nf_conntrack iptable_filter ip_tables x_tables bridge stp llc intel_rapl
> x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel
> aes_x86_64 crypto_simd glue_helper cryptd intel_cstate hci_uart
> intel_rapl_perf btbcm input_leds serdev serio_raw btqca btintel
> bluetooth intel_pch_thermal intel_lpss_acpi intel_lpss mac_hid acpi_pad
> [ 3498.162060] ecdh_generic acpi_als kfifo_buf industrialio autofs4
> btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy
> async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath
> linear raid1 e1000e psmouse ptp ahci pps_core libahci wmi
> pinctrl_sunrisepoint i2c_hid video pinctrl_intel hid
> [ 3498.162386] CPU: 7 PID: 29041 Comm: btrfs Not tainted
> 4.14.0-041400-generic #201711122031
> [ 3498.162545] Hardware name: FUJITSU /D3401-H2, BIOS V5.0.0.12 R1.5.0
> for D3401-H2x 02/27/2017
> [ 3498.162723] task: ffff8d7858e82f00 task.stack: ffffb4ee47d5c000
> [ 3498.162890] RIP: 0010:read_node_slot+0xd7/0xe0 [btrfs]
> [ 3498.163027] RSP: 0018:ffffb4ee47d5fb88 EFLAGS: 00010246
> [ 3498.163156] RAX: ffff8d78c8bb7000 RBX: ffff8d8124abd380 RCX:
> 0000000000000001
> [ 3498.163290] RDX: 0000000000000048 RSI: ffff8d7ae1fef6f8 RDI:
> ffff8d8124aa0000
> [ 3498.163422] RBP: ffffb4ee47d5fba8 R08: 0000000000000001 R09:
> ffff8d8124abd384
> [ 3498.163555] R10: 0000000000000001 R11: 0000000000114000 R12:
> 0000000000000002
> [ 3498.163689] R13: ffffb4ee47d5fc66 R14: ffffb4ee47d5fc50 R15:
> 0000000000000000
> [ 3498.163825] FS: 00007fa4c9a998c0(0000) GS:ffff8d816e5c0000(0000)
> knlGS:0000000000000000
> [ 3498.163990] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3498.164120] CR2: 000056410155a028 CR3: 00000009c194c002 CR4:
> 00000000003606e0
> [ 3498.164255] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 3498.164390] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [ 3498.164523] Call Trace:
> [ 3498.164694] tree_advance+0x16e/0x1d0 [btrfs]
> [ 3498.164874] btrfs_compare_trees+0x2da/0x6a0 [btrfs]
> [ 3498.165078] ? process_extent+0x1580/0x1580 [btrfs]
> [ 3498.165264] btrfs_ioctl_send+0xe94/0x1120 [btrfs]
It's using send + balance at the same time. There's something that makes
btrfs explode when you do that.
It's not new in 4.14, I have seen it in 4.7 and 4.9 also, various
different explosions in kernel log. Since that happened, I made sure I
never did those two things at the same time.
> [ 3498.165450] btrfs_ioctl+0x93c/0x1f00 [btrfs]
> [ 3498.165587] ? enqueue_task_fair+0xa8/0x6c0
> [ 3498.165724] do_vfs_ioctl+0xa5/0x600
> [ 3498.165854] ? do_vfs_ioctl+0xa5/0x600
> [ 3498.165979] ? _do_fork+0x144/0x3a0
> [ 3498.166103] SyS_ioctl+0x79/0x90
> [ 3498.166234] entry_SYSCALL_64_fastpath+0x1e/0xa9
> [ 3498.166368] RIP: 0033:0x7fa4c8b17f07
> [ 3498.166488] RSP: 002b:00007ffd33644e38 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000010
> [ 3498.166653] RAX: ffffffffffffffda RBX: 00007fa4c8a1a700 RCX:
> 00007fa4c8b17f07
> [ 3498.166787] RDX: 00007ffd33644f30 RSI: 0000000040489426 RDI:
> 0000000000000004
> [ 3498.166921] RBP: 00007ffd33644dc0 R08: 0000000000000000 R09:
> 00007fa4c8a1a700
> [ 3498.167055] R10: 00007fa4c8a1a9d0 R11: 0000000000000202 R12:
> 0000000000000000
> [ 3498.167190] R13: 00007ffd33644dbf R14: 00007fa4c8a1a9c0 R15:
> 000000000129f020
> [ 3498.167326] Code: 48 c7 c3 fb ff ff ff e8 f8 5c 05 00 48 89 d8 5b 41
> 5c 41 5d 41 5e 5d c3 48 c7 c3 fe ff ff ff 48 89 d8 5b 41 5c 41 5d 41 5e
> 5d c3 <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41
> [ 3498.167690] RIP: read_node_slot+0xd7/0xe0 [btrfs] RSP: ffffb4ee47d5fb88
> [ 3498.167892] ---[ end trace 6a751a3020dd3086 ]---
> [ 3499.572729] BTRFS info (device sdb3): relocating block group
> 304972038144 flags data|raid1
> [ 3504.068432] BTRFS info (device sdb3): found 2037 extents
> [ 3538.281808] BTRFS info (device sdb3): found 2037 extents
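The user-space register dump quoted above is consistent with the send reading of the trace: at syscall entry RSI holds the ioctl request number, 0x40489426, which splits into type 0x94 (the btrfs ioctl magic) and number 38, matching BTRFS_IOC_SEND in the kernel's UAPI header. A minimal sketch of that decoding follows, assuming the generic Linux _IOC bit layout as used on x86; decode_ioctl is a helper name made up for this example.

```shell
#!/bin/sh
# Split a Linux ioctl request number into its _IOC fields.
# Assumed layout (generic, as on x86): bits 0-7 nr, 8-15 type,
# 16-29 argument size, 30-31 direction.
decode_ioctl() {
    req=$(( $1 ))
    printf 'dir=%d size=%d type=0x%x nr=%d\n' \
        $(( (req >> 30) & 0x3 )) \
        $(( (req >> 16) & 0x3fff )) \
        $(( (req >> 8) & 0xff )) \
        $(( req & 0xff ))
}

# RSI value taken from the register dump in the report.
decode_ioctl 0x40489426
# prints: dir=1 size=72 type=0x94 nr=38
```

dir=1 is the "write" direction in this layout, and size=72 is the size of the argument structure passed in RDX.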
--
Hans van Kranenburg
* Re: 4.14 balance: kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!
From: Tomasz Chmielewski @ 2017-11-18 1:15 UTC
To: Hans van Kranenburg; +Cc: Btrfs BTRFS
On 2017-11-18 10:08, Hans van Kranenburg wrote:
> On 11/18/2017 01:49 AM, Tomasz Chmielewski wrote:
>> I'm getting the following BUG when running balance on one of my
>> systems:
>>
>>
>> [ 3458.698704] BTRFS info (device sdb3): relocating block group
>> 306045779968 flags data|raid1
>> [ 3466.892933] BTRFS info (device sdb3): found 2405 extents
>> [ 3495.408630] BTRFS info (device sdb3): found 2405 extents
>> [ 3498.161144] ------------[ cut here ]------------
>> [ 3498.161150] kernel BUG at
>> /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!
>> [ 3498.161264] invalid opcode: 0000 [#1] SMP
(...)
>> [ 3498.164523] Call Trace:
>> [ 3498.164694] tree_advance+0x16e/0x1d0 [btrfs]
>> [ 3498.164874] btrfs_compare_trees+0x2da/0x6a0 [btrfs]
>> [ 3498.165078] ? process_extent+0x1580/0x1580 [btrfs]
>> [ 3498.165264] btrfs_ioctl_send+0xe94/0x1120 [btrfs]
>
> It's using send + balance at the same time. There's something that
> makes
> btrfs explode when you do that.
>
> It's not new in 4.14, I have seen it in 4.7 and 4.9 also, various
> different explosions in kernel log. Since that happened, I made sure I
> never did those two things at the same time.
Indeed, send was started when balance was running.
Thanks for the hint.
Tomasz Chmielewski
https://lxadm.com
* Re: 4.14 balance: kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!
From: Roman Mamedov @ 2017-11-18 9:40 UTC
To: Hans van Kranenburg; +Cc: Tomasz Chmielewski, Btrfs BTRFS
On Sat, 18 Nov 2017 02:08:46 +0100
Hans van Kranenburg <hans.van.kranenburg@mendix.com> wrote:
> It's using send + balance at the same time. There's something that makes
> btrfs explode when you do that.
>
> It's not new in 4.14, I have seen it in 4.7 and 4.9 also, various
> different explosions in kernel log. Since that happened, I made sure I
> never did those two things at the same time.
Shouldn't it prevent send during balance, or balance during send then, if
that's the case?
You talk about it "exploding" like it's a normal thing, to have Invalid opcode
BUGs in kernel log, and the user has to take care to not use two of the
regular FS features at the same time.
Seems to be a bug which should be fixed, rather than warning everyone "not to
send during balance".
--
With respect,
Roman
* Re: 4.14 balance: kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!
From: waxhead @ 2017-11-18 10:41 UTC
To: Roman Mamedov, Hans van Kranenburg; +Cc: Tomasz Chmielewski, Btrfs BTRFS
Roman Mamedov wrote:
> On Sat, 18 Nov 2017 02:08:46 +0100
> Hans van Kranenburg <hans.van.kranenburg@mendix.com> wrote:
>
>> It's using send + balance at the same time. There's something that makes
>> btrfs explode when you do that.
>>
>> It's not new in 4.14, I have seen it in 4.7 and 4.9 also, various
>> different explosions in kernel log. Since that happened, I made sure I
>> never did those two things at the same time.
>
> Shouldn't it prevent send during balance, or balance during send then, if
> that's the case?
>
> You talk about it "exploding" like it's a normal thing, to have Invalid opcode
> BUGs in kernel log, and the user has to take care to not use two of the
> regular FS features at the same time.
>
> Seems to be a bug which should be fixed, rather than warning everyone "not to
> send during balance".
>
Agree, either that or someone needs to update the status page to show
that either send or balance is 'unstable'
* Re: 4.14 balance: kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!
From: Hans van Kranenburg @ 2017-11-18 11:48 UTC
To: Roman Mamedov; +Cc: Tomasz Chmielewski, Btrfs BTRFS
On 11/18/2017 10:40 AM, Roman Mamedov wrote:
> On Sat, 18 Nov 2017 02:08:46 +0100
> Hans van Kranenburg <hans.van.kranenburg@mendix.com> wrote:
>
>> It's using send + balance at the same time. There's something that makes
>> btrfs explode when you do that.
>>
>> It's not new in 4.14, I have seen it in 4.7 and 4.9 also, various
>> different explosions in kernel log. Since that happened, I made sure I
>> never did those two things at the same time.
>
> Shouldn't it prevent send during balance, or balance during send then, if
> that's the case?
>
> You talk about it "exploding" like it's a normal thing,
No, no, it's absolutely not normal of course. It's pretty scary.
> to have Invalid opcode
> BUGs in kernel log, and the user has to take care to not use two of the
> regular FS features at the same time.
>
> Seems to be a bug which should be fixed, rather than warning everyone "not to
> send during balance".
Yes, sure. In the meantime, watch out with that combination. And that's
what I quickly typed yesterday, because it was too late already.
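One way to "watch out" in practice is to make the two jobs take the same lock, so a scheduled send can never start while a scheduled balance is running, and vice versa. The following is only a sketch of that idea, not a fix for the underlying bug: it assumes flock(1) from util-linux is installed, and the lock path and the run_exclusive name are made up for this example.

```shell
#!/bin/sh
# Hypothetical lock file; any path that both jobs agree on will do.
LOCK="${BTRFS_MAINT_LOCK:-/tmp/btrfs-maint.lock}"

# Run a command while holding an exclusive lock on $LOCK.
# A second run_exclusive call blocks until the first one has finished,
# and the exit status of the wrapped command is passed through.
run_exclusive() {
    flock "$LOCK" "$@"
}

# Intended use (not executed here):
#   run_exclusive btrfs balance start --full-balance /btrfs
#   run_exclusive btrfs send /btrfs/usro > /dev/null
```

This only serializes commands that go through the wrapper. A balance started by hand, or one started in the background so that the command returns early, is not covered.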
My experience with reporting bugs like this is that to get time spent on
it, either a developer himself has to experience the issue, or you have
to be able to provide a reliable reproducer. And yes, I also never got
to doing that myself yet.
So, who wants to help?
1. Find a test system that you can crash.
2. Create a test filesystem with some data.
3. Run with 4.14? (makes the most sense I think)
4. Continuously feed the data to balance and send everything to /dev/null
5. Collect stack traces and borken filesystem images.
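For step 5, a small filter helps to pull the interesting lines out of a long kernel log before attaching it to a report. This is a rough sketch only; the pattern list is a guess at what is useful here and filter_btrfs_trouble is a name invented for this example.

```shell
#!/bin/sh
# Read kernel log text on stdin and keep only the lines that usually
# matter for a btrfs bug report: BUG lines, btrfs errors and warnings,
# and the start of stack traces.
filter_btrfs_trouble() {
    grep -E 'kernel BUG at|BTRFS (error|critical|warning)|Call Trace:|RIP:'
}

# Intended use (not executed here):
#   dmesg | filter_btrfs_trouble > btrfs-report.txt
```

The full dmesg output is still worth keeping next to the filtered one, since the filter drops the register dump and the individual stack frames.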
--
Hans van Kranenburg
* Re: 4.14 balance: kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!
From: Hans van Kranenburg @ 2017-11-18 12:29 UTC
To: Roman Mamedov; +Cc: Tomasz Chmielewski, Btrfs BTRFS
On 11/18/2017 12:48 PM, Hans van Kranenburg wrote:
>
> So, who wants to help?
>
> 1. Find a test system that you can crash.
> 2. Create a test filesystem with some data.
> 3. Run with 4.14? (makes the most sense I think)
> 4. Continuously feed the data to balance and send everything to /dev/null
> 5. Collect stack traces and borken filesystem images.
Ok, what I just did:
-# mkfs.btrfs --nodiscard /dev/xvdb
-# mount -o noatime /dev/xvdb /btrfs
-# cd /btrfs/
-# btrfs sub create usr
-# rsync -av /usr/ usr/
-# btrfs sub snap -r usr/ usro
And then, both at the same time:
-# while true; do btrfs send usro/ > /dev/null; date; echo 3 > /proc/sys/vm/drop_caches; done
and
-# while true; do btrfs balance start --full-balance /btrfs; done
After a few minutes we have the first error:
[ 1605.484627] BTRFS error (device xvdb): did not find backref in send_root. inode=4260, offset=0, disk_byte=130793844736 found extent=130793844736
Sat Nov 18 13:25:28 CET 2017
At subvol usro/
ERROR: send ioctl failed with -5: Input/output error
No big stack trace this time.
--
Hans van Kranenburg
* Re: 4.14 balance: kernel BUG at /home/kernel/COD/linux/fs/btrfs/ctree.c:1856!
From: Hans van Kranenburg @ 2017-11-18 19:11 UTC
To: Roman Mamedov; +Cc: Tomasz Chmielewski, Btrfs BTRFS
On 11/18/2017 12:48 PM, Hans van Kranenburg wrote:
>
> So, who wants to help?
>
> 1. Find a test system that you can crash.
> 2. Create a test filesystem with some data.
> 3. Run with 4.14? (makes the most sense I think)
> 4. Continuously feed the data to balance and send everything to /dev/null
> 5. Collect stack traces and borken filesystem images.
I moved it to a ramdisk:
-# modprobe brd rd_nr=1 rd_size=4194304
-# mkfs.btrfs -m single -d single /dev/ram0
-# mount -o noatime /dev/ram0 /btrfs
-# cd /btrfs
-# btrfs sub create moo
-# rsync -av /usr/ moo/
-# btrfs sub snap -r moo/ moo-ro
Now remove part of the files, yolo style
-# rm $(find moo -type f | shuf | head -n 5000)
Put them back, so we have some differences for incremental send
-# rsync -av /usr/ moo/
-# btrfs sub snap -r moo/ moo-ro2
Now again:
-# while true; do btrfs balance start --full-balance /btrfs; done
and
-# while true; do btrfs send --no-data /btrfs/moo-ro/ | wc -c; btrfs send --no-data -p /btrfs/moo-ro/ /btrfs/moo-ro2/ | wc -c; date; done
This gets rid of the disk traffic, and kernel CPU time goes above 300%.
The error seen before is easily triggered. It happens both on normal and
on incremental send. It happens both when using --no-data and when not
using that option. send aborts with "ERROR: send ioctl failed with -5:
Input/output error", and dmesg shows:
[...]
[17094.578876] BTRFS error (device ram0): did not find backref in send_root. inode=28151, offset=0, disk_byte=8769369178112 found extent=8769369178112
[17328.368458] BTRFS error (device ram0): did not find backref in send_root. inode=23264, offset=0, disk_byte=8902861979648 found extent=8902861979648
[17352.779099] BTRFS error (device ram0): did not find backref in send_root. inode=17392, offset=0, disk_byte=8917230010368 found extent=8917230010368
[18012.009357] BTRFS error (device ram0): did not find backref in send_root. inode=29245, offset=0, disk_byte=9295300538368 found extent=9295300538368
[18193.218649] BTRFS error (device ram0): did not find backref in send_root. inode=16437, offset=0, disk_byte=9400309366784 found extent=9400309366784
[18604.697898] BTRFS error (device ram0): did not find backref in send_root. inode=24508, offset=0, disk_byte=9635165790208 found extent=9635165790208
[18621.053722] BTRFS error (device ram0): did not find backref in send_root. inode=10039, offset=0, disk_byte=9644150468608 found extent=9644150468608
[19039.051399] BTRFS error (device ram0): did not find backref in send_root. inode=29411, offset=0, disk_byte=9883807432704 found extent=9883807432704
[19373.297701] BTRFS error (device ram0): did not find backref in send_root. inode=7946, offset=0, disk_byte=10074868215808 found extent=10074868215808
[19573.432255] BTRFS error (device ram0): did not find backref in send_root. inode=26743, offset=0, disk_byte=10190374899712 found extent=10190374899712
[19682.305240] BTRFS error (device ram0): did not find backref in send_root. inode=24823, offset=0, disk_byte=10252750929920 found extent=10252750929920
[20012.420346] BTRFS error (device ram0): did not find backref in send_root. inode=25763, offset=0, disk_byte=10441684029440 found extent=10441684029440
[20430.100411] BTRFS error (device ram0): did not find backref in send_root. inode=14572, offset=0, disk_byte=10680836050944 found extent=10680836050944
[21328.821766] BTRFS error (device ram0): did not find backref in send_root. inode=11756, offset=0, disk_byte=11195322470400 found extent=11195322470400
[...]
After a few hours I have a long list of those, but that's all so far. No
other big explosions.
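To see whether the same inodes keep coming back in that list, the messages can be reduced to "inode disk_byte" pairs. A minimal sketch, assuming each message sits on a single line as dmesg prints it (before any mail line wrapping); backref_errors is a name made up for this example.

```shell
#!/bin/sh
# Read kernel log text on stdin and print "inode disk_byte" for every
# "did not find backref in send_root" message. Other lines are dropped.
backref_errors() {
    sed -n 's/.*did not find backref in send_root\. inode=\([0-9]*\), offset=[0-9]*, disk_byte=\([0-9]*\).*/\1 \2/p'
}

# Intended use (not executed here):
#   dmesg | backref_errors | cut -d' ' -f1 | sort -n | uniq -c
```

The usage line counts how often each inode number shows up, which gives a quick hint whether the failures hit random files or a fixed subset.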
So, does anyone have an idea of what to try next? Maybe it needs more
than 1 block group each for data and metadata? Maybe speeding it up
(with a small amount of data and the ramdisk) does not increase the
chance of triggering something, but decreases it instead?
Welcome to the world of trying to reproduce errors... :D
--
Hans van Kranenburg