* kernel BUG at /build/buildd/linux-3.2.0/fs/btrfs/extent-tree.c:4816!
From: Karl Mardoff Kittilsen @ 2011-11-29 1:39 UTC
To: linux-btrfs
Hi!
Sending a mail on this issue, as advised on IRC.
My /home file system fails to mount and the kernel seems to freeze, so I
need to do the Alt+SysRq RSNEIUB routine to reboot it safely.
The corruption happened on a 3.2-rc<something> kernel and Ubuntu 11.10,
but I am now running on Ubuntu 12.04 with the 3.2.0-2-generic kernel to
see if that helped; it did not.
btrfsck from the latest btrfs-tools returns:
karl@karl-precise:~/git/btrfs-progs$ sudo ./btrfsck /dev/md0
ref mismatch on [2176962560 8192] extent item 480, found 1
Incorrect local backref count on 2176970752 root 5 owner 2101705 offset
368640 found 1 wanted 3925868545
backpointer mismatch on [2176970752 4096]
found 1322579566593 bytes used err is 1
total csum bytes: 1288573748
total tree bytes: 3057922048
total fs tree bytes: 862068736
btree space waste bytes: 704584583
file data blocks allocated: 18991122972672
referenced 1361205268480
Btrfs Btrfs v0.19-dirty
The file system is on a md raid1 device, and the only thing that I have
done recently that might be related is that I made a script
to run through all my files and defrag them as well as compress them.
That completed without any errors and I gained about 10% of space :)
This was about 5 days ago, after that I used it like normal without any
problems.
Mount options are "defaults,compression=zlib"
This is the trace from dmesg when I try to mount it:
Nov 29 01:17:30 karl-precise kernel: [ 100.963449] ------------[ cut
here ]------------
Nov 29 01:17:30 karl-precise kernel: [ 100.963478] kernel BUG at
/build/buildd/linux-3.2.0/fs/btrfs/extent-tree.c:4816!
Nov 29 01:17:30 karl-precise kernel: [ 100.963516] invalid opcode: 0000
[#1] SMP
Nov 29 01:17:30 karl-precise kernel: [ 100.963534] CPU 3
Nov 29 01:17:30 karl-precise kernel: [ 100.963543] Modules linked in:
nls_iso8859_1 nls_cp437 vfat fat rfcomm bnep bluetooth parport_pc ppdev
binfmt_misc snd_hda_codec_hdmi arc4 rt2500usb rt2x00usb rt2x00lib
mac80211 snd_hda_codec_realtek cfg80211 snd_hda_intel snd_hda_codec
snd_hwdep snd_pcm snd_seq_midi radeon snd_rawmidi snd_seq_midi_event
snd_seq psmouse snd_timer snd_seq_device snd ttm sp5100_tco
drm_kms_helper drm soundcore snd_page_alloc i2c_algo_bit i2c_piix4
edac_core wmi asus_atk0110 k10temp serio_raw edac_mce_amd lp parport
raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov
usb_storage uas usbhid hid raid6_pq async_tx raid0 multipath raid1
linear pata_atiixp btrfs zlib_deflate firewire_ohci firewire_core
crc_itu_t r8169 libcrc32c
Nov 29 01:17:30 karl-precise kernel: [ 100.963855]
Nov 29 01:17:30 karl-precise kernel: [ 100.963862] Pid: 2184, comm:
mount Not tainted 3.2.0-2-generic #4-Ubuntu System manufacturer System
Product Name/M4A79T Deluxe
Nov 29 01:17:30 karl-precise kernel: [ 100.963908] RIP:
0010:[<ffffffffa0060ef7>] [<ffffffffa0060ef7>]
__btrfs_free_extent+0x617/0x650 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.963958] RSP:
0018:ffff880404ec9778 EFLAGS: 00010207
Nov 29 01:17:30 karl-precise kernel: [ 100.963979] RAX:
00000000ea000001 RBX: ffff8803e23ce000 RCX: 0000000000000000
Nov 29 01:17:30 karl-precise kernel: [ 100.964006] RDX:
ffff880000000000 RSI: 00000000000007ad RDI: ffff8803e23d0280
Nov 29 01:17:30 karl-precise kernel: [ 100.964046] RBP:
ffff880404ec9838 R08: 00000000000007b1 R09: 0000000000000000
Nov 29 01:17:30 karl-precise kernel: [ 100.964078] R10:
000000000000000d R11: ffff8803dac09840 R12: 000000000000002c
Nov 29 01:17:30 karl-precise kernel: [ 100.964109] R13:
0000000081c1f000 R14: 0000000000001000 R15: 0000000000000000
Nov 29 01:17:30 karl-precise kernel: [ 100.964141] FS:
00007f2290850820(0000) GS:ffff88042fcc0000(0000) knlGS:0000000000000000
Nov 29 01:17:30 karl-precise kernel: [ 100.964177] CS: 0010 DS: 0000
ES: 0000 CR0: 000000008005003b
Nov 29 01:17:30 karl-precise kernel: [ 100.964203] CR2:
00007f641727a000 CR3: 00000003ea2cf000 CR4: 00000000000006e0
Nov 29 01:17:30 karl-precise kernel: [ 100.964235] DR0:
0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 29 01:17:30 karl-precise kernel: [ 100.964266] DR3:
0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Nov 29 01:17:30 karl-precise kernel: [ 100.964298] Process mount (pid:
2184, threadinfo ffff880404ec8000, task ffff8803ea29c530)
Nov 29 01:17:30 karl-precise kernel: [ 100.964334] Stack:
Nov 29 01:17:30 karl-precise kernel: [ 100.964344] 0000000000000000
0000000000000005 00000000002011c9 000000000005a000
Nov 29 01:17:30 karl-precise kernel: [ 100.964386] ffff880400000035
ffff880414f52000 0000000100000001 ffff8803e7a0e800
Nov 29 01:17:30 karl-precise kernel: [ 100.964417] ffff8803e7a0fc00
ffff8803e23cf000 000000000000077c ffff8803e23d0280
Nov 29 01:17:30 karl-precise kernel: [ 100.964449] Call Trace:
Nov 29 01:17:30 karl-precise kernel: [ 100.964467]
[<ffffffffa0061180>] run_delayed_data_ref+0xb0/0x1a0 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.964496]
[<ffffffff8116087f>] ? kmem_cache_free+0x2f/0x110
Nov 29 01:17:30 karl-precise kernel: [ 100.965751]
[<ffffffffa0064b3e>] run_one_delayed_ref+0x8e/0xf0 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.966996]
[<ffffffffa0064c74>] run_clustered_refs+0xd4/0x240 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffffa0064eaa>] btrfs_run_delayed_refs+0xca/0x220 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff8165135d>] ? mutex_lock+0x1d/0x50
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffffa008ede6>] ? btrfs_run_ordered_operations+0x1d6/0x1f0 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffffa0074f53>] btrfs_commit_transaction+0x93/0x840 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff81089c50>] ? add_wait_queue+0x60/0x60
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff8116087f>] ? kmem_cache_free+0x2f/0x110
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffffa00a8982>] btrfs_recover_log_trees+0x2d2/0x300 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffffa00a75e0>] ? fixup_inode_link_counts+0x150/0x150 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffffa0073411>] open_ctree+0x1471/0x1920 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff81311d74>] ? snprintf+0x34/0x40
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffffa00c2582>] btrfs_fill_super.isra.38+0x72/0x12c [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff811e1d7a>] ? disk_name+0xba/0xc0
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff8130f397>] ? strlcpy+0x47/0x60
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffffa0052807>] btrfs_mount+0x497/0x4e0 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff81179b43>] mount_fs+0x43/0x1b0
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff811941ba>] vfs_kern_mount+0x6a/0xc0
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff81195664>] do_kern_mount+0x54/0x110
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff811971b4>] do_mount+0x1a4/0x260
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff81197690>] sys_mount+0x90/0xe0
Nov 29 01:17:30 karl-precise kernel: [ 100.967397]
[<ffffffff8165ad02>] system_call_fastpath+0x16/0x1b
Nov 29 01:17:30 karl-precise kernel: [ 100.967397] Code: 0f 85 94 fa ff
ff 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 48 8b 55 c8 48 8b 3b 48 8d 73 40
e8 98 17 06 00 39 45 20 0f 84 e9 fd ff ff <0f> 0b 0f 0b 89 c6 4c 89 ea
31 c0 48 c7 c7 48 9d 0c a0 e8 7b 93
Nov 29 01:17:30 karl-precise kernel: [ 100.967397] RIP
[<ffffffffa0060ef7>] __btrfs_free_extent+0x617/0x650 [btrfs]
Nov 29 01:17:30 karl-precise kernel: [ 100.967397] RSP <ffff880404ec9778>
Nov 29 01:17:30 karl-precise kernel: [ 101.005914] ---[ end trace
ae54b272e480df0f ]---
--------------- After digging through some log files I found the first
occurrence of this error, with some new log lines -----------
These lines occurred just before the first time the partition became
unmountable:
Nov 27 23:45:47 karl-workstation kernel: [211390.634303] btrfs csum
failed ino 3738022 off 1819189248 csum 318166411 private 1787547189
Nov 27 23:45:54 karl-workstation kernel: [211398.556254] btrfs csum
failed ino 3738022 off 1819189248 csum 2203380165 private 1787547189
Nov 27 23:45:55 karl-workstation kernel: [211398.676454] btrfs csum
failed ino 3738022 off 1819189248 csum 2203380165 private 1787547189
Nov 27 23:45:55 karl-workstation kernel: [211398.679193] btrfs csum
failed ino 3738022 off 1819189248 csum 2203380165 private 1787547189
And then this
Nov 28 00:11:14 karl-workstation kernel: [212918.235045] ------------[
cut here ]------------
Nov 28 00:11:14 karl-workstation kernel: [212918.235050] kernel BUG at
/home/apw/COD/linux/fs/btrfs/extent-tree.c:4775!
Nov 28 00:11:14 karl-workstation kernel: [212918.235052] invalid opcode:
0000 [#1] SMP
Nov 28 00:11:14 karl-workstation kernel: [212918.235054] CPU 0
Nov 28 00:11:14 karl-workstation kernel: [212918.235056] Modules linked
in: nls_iso8859_1 nls_cp437 vfat fat bnep rfcomm bluetooth
ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat nf_nat
nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT
xt_CHECKSUM iptable_mangle xt_tcpudp nfsd iptable_filter lockd ip_tables
nfs_acl x_tables auth_rpcgss sunrpc bridge stp kvm_amd kvm ppdev
binfmt_misc arc4 rt2500usb rt2x00usb rt2x00lib mac80211 cfg80211
snd_hda_codec_hdmi snd_hda_codec_realtek fglrx(P) snd_hda_intel psmouse
snd_seq_midi snd_hda_codec snd_rawmidi snd_hwdep snd_seq_midi_event
snd_pcm snd_seq edac_core serio_raw edac_mce_amd k10temp sp5100_tco
snd_seq_device i2c_piix4 snd_timer asus_atk0110 snd soundcore
snd_page_alloc wmi lp parport raid10 raid456 async_pq async_xor xor
async_memcpy async_raid6_recov usb_storage uas usbhid hid raid6_pq
async_tx raid1 pata_atiixp raid0 firewire_ohci ahci libahci multipath
firewire_core crc_itu_t linear btrfs r8169 zlib_deflate libcrc32c [last
unloaded: parport_pc]
Nov 28 00:11:14 karl-workstation kernel: [212918.235092]
Nov 28 00:11:14 karl-workstation kernel: [212918.235094] Pid: 6962,
comm: btrfs-endio-wri Tainted: P O 3.2.0-999-generic
#201111220410 System manufacturer System Product Name/M4A79T Deluxe
Nov 28 00:11:14 karl-workstation kernel: [212918.235098] RIP:
0010:[<ffffffffa002b910>] [<ffffffffa002b910>]
__btrfs_free_extent+0x6c0/0x700 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235117] RSP:
0018:ffff880380173990 EFLAGS: 00010207
Nov 28 00:11:14 karl-workstation kernel: [212918.235118] RAX:
00000000ea000001 RBX: ffff880412c3ab40 RCX: ffff880380173900
Nov 28 00:11:14 karl-workstation kernel: [212918.235120] RDX:
ffff880000000000 RSI: 00000000000007ad RDI: ffff88027db9a8c0
Nov 28 00:11:14 karl-workstation kernel: [212918.235121] RBP:
ffff880380173a80 R08: 00000000000007b1 R09: ffff8803801738f0
Nov 28 00:11:14 karl-workstation kernel: [212918.235123] R10:
0000000000000000 R11: 0000000000000000 R12: 000000000000002c
Nov 28 00:11:14 karl-workstation kernel: [212918.235124] R13:
0000000081c1f000 R14: 0000000000000001 R15: 0000000000000001
Nov 28 00:11:14 karl-workstation kernel: [212918.235126] FS:
00007fd5b95399c0(0000) GS:ffff88042fc00000(0000) knlGS:00000000f67d8880
Nov 28 00:11:14 karl-workstation kernel: [212918.235127] CS: 0010 DS:
0000 ES: 0000 CR0: 000000008005003b
Nov 28 00:11:14 karl-workstation kernel: [212918.235129] CR2:
00007f3a8bbd7000 CR3: 00000003452e1000 CR4: 00000000000006f0
Nov 28 00:11:14 karl-workstation kernel: [212918.235130] DR0:
0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 28 00:11:14 karl-workstation kernel: [212918.235132] DR3:
0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Nov 28 00:11:14 karl-workstation kernel: [212918.235133] Process
btrfs-endio-wri (pid: 6962, threadinfo ffff880380172000, task
ffff8803f47d16f0)
Nov 28 00:11:14 karl-workstation kernel: [212918.235135] Stack:
Nov 28 00:11:14 karl-workstation kernel: [212918.235136]
0000000000000000 0000000000000005 00000000002011c9 000000000005a000
Nov 28 00:11:14 karl-workstation kernel: [212918.235138]
0000160000000000 0000000000000000 0000000200000033 ffff880000000035
Nov 28 00:11:14 karl-workstation kernel: [212918.235140]
0000000112f78030 ffff8804146ee000 0000000100001000 ffff88041194a000
Nov 28 00:11:14 karl-workstation kernel: [212918.235143] Call Trace:
Nov 28 00:11:14 karl-workstation kernel: [212918.235153]
[<ffffffffa002bc04>] run_delayed_data_ref+0x154/0x160 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235162]
[<ffffffffa001a203>] ? leaf_space_used+0xc3/0xf0 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235171]
[<ffffffffa002bcba>] run_one_delayed_ref+0xaa/0xc0 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235180]
[<ffffffffa002bd90>] run_clustered_refs+0xc0/0x220 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235189]
[<ffffffffa002bfba>] btrfs_run_delayed_refs+0xca/0x220 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235193]
[<ffffffff8160f27e>] ? _raw_spin_lock+0xe/0x20
Nov 28 00:11:14 karl-workstation kernel: [212918.235203]
[<ffffffffa003b08f>] __btrfs_end_transaction+0xbf/0x250 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235213]
[<ffffffffa003b295>] btrfs_end_transaction+0x15/0x20 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235223]
[<ffffffffa00403cb>] btrfs_finish_ordered_io+0x16b/0x340 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235233]
[<ffffffffa00405f1>] btrfs_writepage_end_io_hook+0x51/0xa0 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235244]
[<ffffffffa0056c8b>] end_bio_extent_writepage+0x13b/0x180 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235247]
[<ffffffff8160d66b>] ? schedule_timeout+0x18b/0x2e0
Nov 28 00:11:14 karl-workstation kernel: [212918.235250]
[<ffffffff811ab9dd>] bio_endio+0x1d/0x40
Nov 28 00:11:14 karl-workstation kernel: [212918.235259]
[<ffffffffa0034ef4>] end_workqueue_fn+0xf4/0x130 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235269]
[<ffffffffa0063f8c>] worker_loop+0x15c/0x4c0 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235279]
[<ffffffffa0063e30>] ? check_pending_worker_creates+0xd0/0xd0 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235283]
[<ffffffff81088536>] kthread+0x96/0xa0
Nov 28 00:11:14 karl-workstation kernel: [212918.235285]
[<ffffffff816197f4>] kernel_thread_helper+0x4/0x10
Nov 28 00:11:14 karl-workstation kernel: [212918.235288]
[<ffffffff810884a0>] ? kthread_worker_fn+0x190/0x190
Nov 28 00:11:14 karl-workstation kernel: [212918.235290]
[<ffffffff816197f0>] ? gs_change+0x13/0x13
Nov 28 00:11:14 karl-workstation kernel: [212918.235291] Code: 8b bd 70
ff ff ff e8 00 22 00 00 0f 0b eb fe 48 8b 55 c8 48 8b bd 68 ff ff ff 48
89 de e8 49 b5 ff ff 39 45 20 0f 84 78 fd ff ff <0f> 0b eb fe 0f 0b eb
fe 0f 0b eb fe 0f 0b eb fe 0f 0b eb fe be
Nov 28 00:11:14 karl-workstation kernel: [212918.235309] RIP
[<ffffffffa002b910>] __btrfs_free_extent+0x6c0/0x700 [btrfs]
Nov 28 00:11:14 karl-workstation kernel: [212918.235317] RSP
<ffff880380173990>
Nov 28 00:11:14 karl-workstation kernel: [212918.235320] ---[ end trace
7c26e4285890c533 ]---
And then I had to reboot the system as it became unresponsive.
If you need any more info I will be more than happy to help out.
Karl M. Kittilsen
* Re: kernel BUG at /build/buildd/linux-3.2.0/fs/btrfs/extent-tree.c:4816!
From: Chris Mason @ 2011-11-29 15:12 UTC
To: Karl Mardoff Kittilsen; +Cc: linux-btrfs
On Tue, Nov 29, 2011 at 02:39:26AM +0100, Karl Mardoff Kittilsen wrote:
> Hi!
>
> Sending a mail on this issue, as advised on IRC.
>
> My /home file system fails to mount and the kernel seem to freeze
> and I need to do the Alt+SysRq RSNEIUB routine to boot it safely.
> The corruption happened on a 3.2-rc<something> kernel and Ubuntu
> 11.10, but I am now running on Ubuntu 12.04 with the 3.2.0-2-generic
> kernel to see if that helped, it did not.
> btrfsck from the latest btrfs-tools returns:
>
> karl@karl-precise:~/git/btrfs-progs$ sudo ./btrfsck /dev/md0
> ref mismatch on [2176962560 8192] extent item 480, found 1
> Incorrect local backref count on 2176970752 root 5 owner 2101705
> offset 368640 found 1 wanted 3925868545
> backpointer mismatch on [2176970752 4096]
So the crashes below were because we tried to free one of these extents.
You have two extents whose reference counts are way off.
Unfortunately this is stored on disk, so different kernels aren't going
to fix it (yet). One of the extents is in a file with inode number
2101705, and the other is in a btree block (2176962560).
I'll be able to fix this soon, but we can also make a patch that changes
those BUG_ONs to just deal with the mismatch. The worst case here would
be leaking those two extents, about 12K of data.
-chris
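The 12K figure matches the two extent lengths in the btrfsck report: [2176962560 8192] and [2176970752 4096]. As a minimal sketch of that arithmetic in userspace C (struct extent_range, bad_extents and leaked_bytes are names made up for this illustration, not btrfs code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one extent as btrfsck prints it: [bytenr length]. */
struct extent_range {
    uint64_t bytenr;  /* logical start address in bytes */
    uint64_t length;  /* extent size in bytes */
};

/* The two extents from the btrfsck report quoted above. */
const struct extent_range bad_extents[] = {
    { 2176962560ULL, 8192 },  /* btree block */
    { 2176970752ULL, 4096 },  /* data extent owned by inode 2101705 */
};

/* Sum the lengths: the space lost if these extents are never freed. */
uint64_t leaked_bytes(const struct extent_range *ext, size_t n)
{
    uint64_t total = 0;

    assert(ext != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += ext[i].length;
    return total;
}
```

Calling leaked_bytes(bad_extents, 2) gives 12288 bytes, i.e. 12 KiB. The two ranges are also adjacent, since 2176962560 + 8192 = 2176970752.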
* Re: kernel BUG at /build/buildd/linux-3.2.0/fs/btrfs/extent-tree.c:4816!
From: Karl Mardoff Kittilsen @ 2011-11-29 15:29 UTC
To: Chris Mason; +Cc: linux-btrfs
On 29 Nov 2011 16:12, Chris Mason wrote:
> On Tue, Nov 29, 2011 at 02:39:26AM +0100, Karl Mardoff Kittilsen wrote:
>> Hi!
>>
>> Sending a mail on this issue, as advised on IRC.
>>
>> My /home file system fails to mount and the kernel seem to freeze
>> and I need to do the Alt+SysRq RSNEIUB routine to boot it safely.
>> The corruption happened on a 3.2-rc<something> kernel and Ubuntu
>> 11.10, but I am now running on Ubuntu 12.04 with the 3.2.0-2-generic
>> kernel to see if that helped, it did not.
>> btrfsck from the latest btrfs-tools returns:
>>
>> karl@karl-precise:~/git/btrfs-progs$ sudo ./btrfsck /dev/md0
>> ref mismatch on [2176962560 8192] extent item 480, found 1
>> Incorrect local backref count on 2176970752 root 5 owner 2101705
>> offset 368640 found 1 wanted 3925868545
>> backpointer mismatch on [2176970752 4096]
>
> So the crashes below were because we tried to free one of these extents.
> You have two extents whose reference counts are way off.
>
> Unfortunately this is stored on disk, so different kernels aren't going
> to fix it (yet). One of the extents is in a file with inode number
> 2101705, and the other is in a btree block (2176962560).
>
> I'll be able to fix this soon, but we can also make a patch that changes
> those BUG_ONs to just deal with the mismatch. The worst case here would
> be leaking those two extents, about 12K of data.
>
> -chris
Thank you for looking into it; that does sound really promising. I
am available to test any patches you want tested. Is there anything else
that I can do to help get this issue fixed?
Karl
* Re: kernel BUG at /build/buildd/linux-3.2.0/fs/btrfs/extent-tree.c:4816!
From: Chris Mason @ 2011-11-29 15:49 UTC
To: Karl Mardoff Kittilsen; +Cc: linux-btrfs
On Tue, Nov 29, 2011 at 04:29:54PM +0100, Karl Mardoff Kittilsen wrote:
> Den 29. nov. 2011 16:12, skrev Chris Mason:
> >On Tue, Nov 29, 2011 at 02:39:26AM +0100, Karl Mardoff Kittilsen wrote:
> >>Hi!
> >>
> >>Sending a mail on this issue, as advised on IRC.
> >>
> >>My /home file system fails to mount and the kernel seem to freeze
> >>and I need to do the Alt+SysRq RSNEIUB routine to boot it safely.
> >>The corruption happened on a 3.2-rc<something> kernel and Ubuntu
> >>11.10, but I am now running on Ubuntu 12.04 with the 3.2.0-2-generic
> >>kernel to see if that helped, it did not.
> >>btrfsck from the latest btrfs-tools returns:
> >>
> >>karl@karl-precise:~/git/btrfs-progs$ sudo ./btrfsck /dev/md0
> >>ref mismatch on [2176962560 8192] extent item 480, found 1
> >>Incorrect local backref count on 2176970752 root 5 owner 2101705
> >>offset 368640 found 1 wanted 3925868545
> >>backpointer mismatch on [2176970752 4096]
> >
> >So the crashes below were because we tried to free one of these extents.
> >You have two extents whose reference counts are way off.
> >
> >Unfortunately this is stored on disk, so different kernels aren't going
> >to fix it (yet). One of the extents is in a file with inode number
> >2101705, and the other is in a btree block (2176962560).
> >
> >I'll be able to fix this soon, but we can also make a patch that changes
> >those BUG_ONs to just deal with the mismatch. The worst case here would
> >be leaking those two extents, about 12K of data.
> >
> >-chris
>
> Thank you for looking into it, and that does sounds really
> promising. I am available to test any patches you want tested. Is
> there anything else that I can do to help getting this issue fixed?
The good news about this one is that it is very clear cut. The hard
part is figuring out where these bogus link counts came from.
I'd suggest that you spend some time running memtest on the machine.
-chris
* Re: kernel BUG at /build/buildd/linux-3.2.0/fs/btrfs/extent-tree.c:4816!
From: David Sterba @ 2011-11-29 16:47 UTC
To: Chris Mason, Karl Mardoff Kittilsen, linux-btrfs
On Tue, Nov 29, 2011 at 10:49:13AM -0500, Chris Mason wrote:
> The good news about this one is that it is very clear cut. The hard
> part is figuring out where these bogus link counts came from.
>
> I'd suggest that you spend some time running memtest on the machine.
Just to add some evidence from the log:
Nov 28 00:11:14 karl-workstation kernel: [212918.235050] kernel BUG at
/home/apw/COD/linux/fs/btrfs/extent-tree.c:4775!
Nov 28 00:11:14 karl-workstation kernel: [212918.235118] RAX:
00000000ea000001 RBX: ffff880412c3ab40 RCX: ffff880380173900
^^^^^^^^^^^^^^^^
4765         ret = btrfs_search_slot(trans, extent_root,
4766                                 &key, path, -1, 1);
4767         if (ret) {
4768                 printk(KERN_ERR "umm, got %d back from search"
4769                        ", was looking for %llu\n", ret,
4770                        (unsigned long long)bytenr);
4771                 if (ret > 0)
4772                         btrfs_print_leaf(extent_root,
4773                                          path->nodes[0]);
4774         }
4775         BUG_ON(ret);
The ret value comes from btrfs_search_slot, which returns either "< 0" or 1
when it is nonzero, but RAX has extra bits set (0xea000001); this could
really be a RAM failure.
david
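To make the "extra bits" observation concrete, one can XOR the low 32 bits of RAX with the value 1 that btrfs_search_slot could plausibly have returned and count the bits that differ. This is only a sketch, assuming (as the analysis above does) that RAX held ret at the time of the BUG; popcount32, differing_bits and describe_corruption are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Count the set bits in a 32-bit value without compiler builtins. */
unsigned int popcount32(uint32_t v)
{
    unsigned int n = 0;

    while (v) {
        v &= v - 1;  /* clears the lowest set bit */
        n++;
    }
    return n;
}

/* Bits that differ between the observed and the expected value. */
uint32_t differing_bits(uint32_t observed, uint32_t expected)
{
    return observed ^ expected;
}

/* Write a one-line summary into buf; returns snprintf()'s result. */
int describe_corruption(uint32_t observed, uint32_t expected,
                        char *buf, size_t len)
{
    uint32_t diff = differing_bits(observed, expected);

    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "xor=0x%08" PRIX32 " flipped=%u",
                    diff, popcount32(diff));
}
```

With observed = 0xEA000001 and expected = 1 the XOR is 0xEA000000, which has 5 bits set, all of them in the top byte.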
* Re: kernel BUG at /build/buildd/linux-3.2.0/fs/btrfs/extent-tree.c:4816!
From: Chris Mason @ 2011-11-29 18:12 UTC
To: Karl Mardoff Kittilsen, linux-btrfs
On Tue, Nov 29, 2011 at 05:47:46PM +0100, David Sterba wrote:
> On Tue, Nov 29, 2011 at 10:49:13AM -0500, Chris Mason wrote:
> > The good news about this one is that it is very clear cut. The hard
> > part is figuring out where these bogus link counts came from.
> >
> > I'd suggest that you spend some time running memtest on the machine.
>
> Just to add some evidence from the log:
>
> Nov 28 00:11:14 karl-workstation kernel: [212918.235050] kernel BUG at
> /home/apw/COD/linux/fs/btrfs/extent-tree.c:4775!
> Nov 28 00:11:14 karl-workstation kernel: [212918.235118] RAX:
> 00000000ea000001 RBX: ffff880412c3ab40 RCX: ffff880380173900
> ^^^^^^^^^^^^^^^^
>
> 4765 ret = btrfs_search_slot(trans, extent_root,
> 4766 &key, path, -1, 1);
> 4767 if (ret) {
> 4768 printk(KERN_ERR "umm, got %d back from search"
> 4769 ", was looking for %llu\n", ret,
> 4770 (unsigned long long)bytenr);
> 4771 if (ret > 0)
> 4772 btrfs_print_leaf(extent_root,
> 4773 path->nodes[0]);
> 4774 }
> 4775 BUG_ON(ret);
>
> the ret value comes from btrfs_search_slot, returning " < 0" or 1, but
> RAX has some extra bits set, this could really be a RAM failure.
>
>
> david
Interesting, look at this:
> karl@karl-precise:~/git/btrfs-progs$ sudo ./btrfsck /dev/md0
> ref mismatch on [2176962560 8192] extent item 480, found 1
> Incorrect local backref count on 2176970752 root 5 owner 2101705
> offset 368640 found 1 wanted 3925868545
> backpointer mismatch on [2176970752 4096]
3925868545 == EA000001
Are you sure this is the BUG_ON he was triggering?
-chris
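The equality can be checked mechanically. A small sketch, assuming only the two numbers quoted above (the macro and function names are invented for the example):

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* "wanted" backref count reported by btrfsck, in decimal. */
#define BTRFSCK_WANTED UINT32_C(3925868545)

/* Low 32 bits of RAX as printed in the oops reports. */
#define OOPS_RAX UINT32_C(0xEA000001)

/* Format a 32-bit value as 8 upper-case hex digits.
 * Returns snprintf()'s result, which is 8 on success. */
int format_hex32(uint32_t value, char *buf, size_t len)
{
    assert(buf != NULL && len >= 9);  /* 8 digits plus the terminating NUL */
    return snprintf(buf, len, "%08" PRIX32, value);
}
```

format_hex32(BTRFSCK_WANTED, buf, sizeof buf) fills buf with "EA000001", the same digits as the low half of RAX in both traces.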
* Re: kernel BUG at /build/buildd/linux-3.2.0/fs/btrfs/extent-tree.c:4816!
From: David Sterba @ 2011-12-15 0:01 UTC
To: Chris Mason, Karl Mardoff Kittilsen, linux-btrfs
On Tue, Nov 29, 2011 at 01:12:14PM -0500, Chris Mason wrote:
> > Nov 28 00:11:14 karl-workstation kernel: [212918.235050] kernel BUG at
> > /home/apw/COD/linux/fs/btrfs/extent-tree.c:4775!
> > Nov 28 00:11:14 karl-workstation kernel: [212918.235118] RAX:
> > 00000000ea000001 RBX: ffff880412c3ab40 RCX: ffff880380173900
> > ^^^^^^^^^^^^^^^^
> >
> > 4765 ret = btrfs_search_slot(trans, extent_root,
> > 4766 &key, path, -1, 1);
> > 4767 if (ret) {
> > 4768 printk(KERN_ERR "umm, got %d back from search"
> > 4769 ", was looking for %llu\n", ret,
> > 4770 (unsigned long long)bytenr);
> > 4771 if (ret > 0)
> > 4772 btrfs_print_leaf(extent_root,
> > 4773 path->nodes[0]);
> > 4774 }
> > 4775 BUG_ON(ret);
> >
> > the ret value comes from btrfs_search_slot, returning " < 0" or 1, but
> > RAX has some extra bits set, this could really be a RAM failure.
>
> Interesting, look at this:
>
> > karl@karl-precise:~/git/btrfs-progs$ sudo ./btrfsck /dev/md0
> > ref mismatch on [2176962560 8192] extent item 480, found 1
> > Incorrect local backref count on 2176970752 root 5 owner 2101705
> > offset 368640 found 1 wanted 3925868545
> > backpointer mismatch on [2176970752 4096]
>
> 3925868545 == EA000001
I applied the usual first analysis steps (source line, registers, call
chain). btrfs_search_slot could return 1, and with a memory failure taken
into account that looked possible, though the bit count of 'EA' is 5, which
seems too high.
> Are you sure this is the BUG_ON he was triggering?
This was referring to the second BUG_ON in the logs. I checked the first
BUG_ON again and see:
kernel: [ 100.963478] kernel BUG at
/build/buildd/linux-3.2.0/fs/btrfs/extent-tree.c:4816!
RAX: 00000000ea000001
4815         if (iref) {
4816                 BUG_ON(!found_extent);
4817         } else {
4818                 btrfs_set_extent_refs(leaf, ei, refs);
4819                 btrfs_mark_buffer_dirty(leaf);
4820         }
found_extent is an int, initialized at
4686         int found_extent = 0;
and set at
4712                 if (key.type == BTRFS_EXTENT_ITEM_KEY &&
4713                     key.offset == num_bytes) {
4714                         found_extent = 1;
4715                         break;
4716                 }
This looks like crappy memory as well.
> > offset 368640 found 1 wanted 3925868545
> 3925868545 == EA000001
With the stray 0xEA byte cleared, this would read "found 1 wanted 1".
david
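The "found 1 wanted 1" remark can be illustrated by splitting the reported count into its top byte and the remaining 24 bits. This is a sketch under the assumption that only the top byte is damaged, which the thread suggests but does not prove; struct split32 and split_high_byte are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 32-bit value split into its top byte and its low 24 bits. */
struct split32 {
    uint8_t  high;   /* bits 31..24 */
    uint32_t low24;  /* bits 23..0 */
};

/* Separate the suspicious top byte from the rest of the counter. */
void split_high_byte(uint32_t value, struct split32 *out)
{
    assert(out != NULL);
    out->high  = (uint8_t)(value >> 24);
    out->low24 = value & UINT32_C(0x00FFFFFF);
}
```

For 3925868545 this gives high = 0xEA and low24 = 1, so the count would agree with the "found 1" side if the top byte were zero.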