* Hardware failure or btrfs issue?
From: Peter Chant @ 2013-07-01 22:56 UTC
To: linux-btrfs
Sirs,
my recently slowing file system now goes read-only after I try a
defrag or other operation. I'm wondering whether this is the result of
a hardware failure, a btrfs issue, or something else. Output of dmesg:
[ 127.750401] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 127.750494] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 127.750590] Process btrfs-cleaner (pid: 1346, threadinfo ffff8800687ec000, task ffff88006d742a00)
[ 127.750704] Stack:
[ 127.750733] ffff880068024c38 ffff88006a9a0438 ffff8800687ede48 ffff880069928800
[ 127.750850] ffff88006d742a00 ffff88006d742a00 ffff88006d742a00 0000000000000000
[ 127.750968] ffff8800687edeb8 ffffffff812b8c29 ffff880069928800 0000000000000000
[ 127.751085] Call Trace:
[ 127.751122] [<ffffffff812b8c29>] cleaner_kthread+0xa9/0x120
[ 127.751200] [<ffffffff812b8b80>] ? write_dev_flush.part.107+0xc0/0xc0
[ 127.751289] [<ffffffff81069450>] kthread+0xc0/0xd0
[ 127.751354] [<ffffffff81069390>] ? kthread_create_on_node+0x130/0x130
[ 127.751444] [<ffffffff816976dc>] ret_from_fork+0x7c/0xb0
[ 127.751516] [<ffffffff81069390>] ? kthread_create_on_node+0x130/0x130
[ 127.751602] Code: 44 28 3f 85 c0 7f 83 31 d2 31 f6 4c 89 ff e8 f7 c5 fe ff eb 84 0f 1f 44 00 00 48 83 c4 18 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 66 66 66 66 90 48
[ 127.752207] RIP [<ffffffff812c1611>] btrfs_clean_old_snapshots+0x131/0x140
[ 127.752305] RSP <ffff8800687ede38>
[ 127.752371] ---[ end trace cc41fa39a41b468e ]---
[ 127.862825] btrfs: corrupt leaf, bad key order: block=2837196627968,root=1, slot=121
[ 127.862938] ------------[ cut here ]------------
[ 127.863009] WARNING: at fs/btrfs/super.c:255 __btrfs_abort_transaction+0xdf/0x100()
[ 127.863110] Hardware name: System Product Name
[ 127.863171] btrfs: Transaction aborted
[ 127.863222] Modules linked in: usblp pl2303 usbserial hid_generic usbhid hid usb_storage lp ppdev parport_pc parport snd_hda_codec_via sp5100_tco acpi_cpufreq mperf freq_table kvm_amd kvm evdev radeon ttm drm_kms_helper psmouse drm serio_raw agpgart i2c_algo_bit microcode snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc i2c_piix4 snd_timer snd atl1e ohci_hcd via_rhine i2c_core shpchp soundcore ehci_pci ehci_hcd mii wmi k10temp asus_atk0110 processor thermal_sys hwmon button
[ 127.864073] Pid: 1347, comm: btrfs-transacti Tainted: G D 3.9.3 #1
[ 127.864167] Call Trace:
[ 127.864204] [<ffffffff8104614f>] warn_slowpath_common+0x7f/0xc0
[ 127.864285] [<ffffffff81046246>] warn_slowpath_fmt+0x46/0x50
[ 127.864370] [<ffffffff812962ef>] __btrfs_abort_transaction+0xdf/0x100
[ 127.864460] [<ffffffff812a71f2>] __btrfs_free_extent+0x242/0x870
[ 127.864543] [<ffffffff813046bc>] ? btrfs_merge_delayed_refs+0x1fc/0x3c0
[ 127.870518] [<ffffffff812ab59b>] run_clustered_refs+0x50b/0xc40
[ 127.876503] [<ffffffff81303813>] ? find_ref_head+0x83/0xf0
[ 127.882501] [<ffffffff812af6b0>] btrfs_run_delayed_refs+0xe0/0x570
[ 127.882503] [<ffffffff812bfb9a>] btrfs_commit_transaction+0xea/0xad0
[ 127.882505] [<ffffffff81069b90>] ? finish_wait+0x80/0x80
[ 127.882513] [<ffffffff812b8605>] transaction_kthread+0x1a5/0x220
[ 127.882517] [<ffffffff812b8460>] ? btree_readpage_end_io_hook+0x2a0/0x2a0
[ 127.882520] [<ffffffff81069450>] kthread+0xc0/0xd0
[ 127.882521] [<ffffffff81069390>] ? kthread_create_on_node+0x130/0x130
[ 127.882523] [<ffffffff816976dc>] ret_from_fork+0x7c/0xb0
[ 127.882524] [<ffffffff81069390>] ? kthread_create_on_node+0x130/0x130
[ 127.882525] ---[ end trace cc41fa39a41b468f ]---
[ 127.882527] BTRFS error (device sdb) in __btrfs_free_extent:5394: IO failure
[ 127.882528] btrfs: run_one_delayed_ref returned -5
[ 127.882529] BTRFS error (device sdb) in btrfs_run_delayed_refs:2565: IO failure
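For anyone triaging a similar log, the line that matters most here is the
"corrupt leaf, bad key order" message. As a rough sketch (not part of the
original report), the following Python pulls the block, root and slot out of
a dmesg capture. The pattern is modelled only on the line shown above; the
wording in other kernel versions may differ, so treat the regular expression
and the helper name find_corrupt_leaves as assumptions.

```python
import re

# Pattern modelled on the "corrupt leaf" line in the log above.
# Assumption: other kernel versions may word this message differently.
# \s* after the colon tolerates the line being wrapped by a mail client.
CORRUPT_LEAF = re.compile(
    r"btrfs: corrupt leaf, bad key order:\s*"
    r"block=(?P<block>\d+),\s*root=(?P<root>\d+),\s*slot=(?P<slot>\d+)"
)


def find_corrupt_leaves(log_text):
    """Return one dict (block, root, slot as ints) per matching log line."""
    return [
        {name: int(value) for name, value in match.groupdict().items()}
        for match in CORRUPT_LEAF.finditer(log_text)
    ]


# Two lines copied from the dmesg output above.
sample = (
    "[ 127.862825] btrfs: corrupt leaf, bad key order: "
    "block=2837196627968,root=1, slot=121\n"
    "[ 127.862938] ------------[ cut here ]------------\n"
)
print(find_corrupt_leaves(sample))
```

Feeding it the full output of dmesg would list every leaf the kernel
complained about, which helps show whether the damage is confined to one
block or spread around.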
I've not done anything other than a cursory check, but it looks like
the data on the read-only file system is fine.
Pete
* Re: Hardware failure or btrfs issue?
From: Hugo Mills @ 2013-07-02 7:29 UTC
To: Peter Chant; +Cc: linux-btrfs
On Mon, Jul 01, 2013 at 11:56:30PM +0100, Peter Chant wrote:
> Sirs,
>
> my recently slowing file system is now going read only after trying
> a defrag or other operation. I'm wondering whether this is the
> result of a hardware failure or a btrfs or some other issue. Output
> of dmesg:
[snip]
> [ 127.862825] btrfs: corrupt leaf, bad key order:
> block=2837196627968,root=1, slot=121
[snip]
This is usually an indication that you have bad hardware -- I'd
suggest testing RAM, PSU, CPU in that order. I'm not sure what, if
anything, can be done to fix the error on the disk right now.
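To illustrate why a "bad key order" message tends to point at hardware,
here is a toy model in Python. btrfs keys are (objectid, type, offset)
triples compared in that order, and the items in a leaf are expected to be
sorted by key. The sketch below flips a single bit in one key, as a faulty
RAM cell might, and then looks for the first slot that breaks the ordering.
The key values and the helper names first_bad_slot and flip_bit are made up
for illustration; this is not the kernel's checking code.

```python
def first_bad_slot(keys):
    """Return the first slot whose key is not greater than the previous
    slot's key, or None if the keys are strictly increasing."""
    for slot in range(1, len(keys)):
        if keys[slot] <= keys[slot - 1]:
            return slot
    return None


def flip_bit(value, bit):
    """Flip one bit of an integer, mimicking a single-bit memory error."""
    return value ^ (1 << bit)


# Hypothetical (objectid, type, offset) keys, sorted as a healthy leaf would be.
keys = [(256, 1, 0), (257, 1, 0), (258, 1, 0), (259, 1, 0)]
print(first_bad_slot(keys))  # None: ordering is intact

# Flip bit 3 of the objectid in slot 1: 257 becomes 265.
damaged = list(keys)
objectid, item_type, offset = damaged[1]
damaged[1] = (flip_bit(objectid, 3), item_type, offset)

# Slot 1 still sorts after slot 0, so the break shows up at slot 2.
print(first_bad_slot(damaged))  # 2
```

Note that in this model the slot reported is where the ordering breaks,
which is not necessarily the slot holding the damaged key.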
> Not that I've done anything other than a cursory check but it looks
> like the read only data is fine.
Might be a good idea to use that to refresh your backups, just in
case my prediction about the fixability is correct.
Hugo.
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- "How deep will this sub go?" "Oh, she'll go all the way to ---
the bottom if we don't stop her."
* Re: Hardware failure or btrfs issue?
From: Peter Chant @ 2013-07-02 17:36 UTC
To: Hugo Mills, linux-btrfs
On 07/02/2013 08:29 AM, Hugo Mills wrote:
> This is usually an indication that you have bad hardware -- I'd
> suggest testing RAM, PSU, CPU in that order. I'm not sure what, if
> anything, can be done to fix the error on the disk right now.
Thanks, appreciated.
Hmm. I've got one stick of RAM out of the machine for testing, as I
had some freezes last week.
If it were the RAM, PSU or CPU, then I'm unsure why this IO issue
only surfaces on the HDD and not the SSD. I ordered a new HDD last
night, before reading your post. If it's not the disk I'll go raid1. If
it is the disk then I'll probably find out.
>> Not that I've done anything other than a cursory check but it looks
>> like the read only data is fine.
> Might be a good idea to use that to refresh your backups, just in
> case my prediction about the fixability is correct.
Well, the first option is to drop in the new disk, freshly format it and
copy the data across (not add it as a second disk). If that fails, the
last backup was Wednesday. I've not done much of note since then apart
from trying to fix the disk issues.
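Since hardware is under suspicion, it may be worth checking the copy
rather than trusting it. The following is a minimal sketch in Python that
compares SHA-256 digests of every regular file under the old and new
trees and lists anything missing or different. The function names are
hypothetical, and if the RAM is still faulty the check itself can report
false mismatches, so a rerun after fixing the hardware is sensible.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(src_root, dst_root):
    """Return relative paths of regular files under src_root that are
    missing under dst_root or whose contents differ. Symlinks are skipped."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    problems = []
    for src in sorted(src_root.rglob("*")):
        if src.is_symlink() or not src.is_file():
            continue
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(str(rel))
    return problems
```

With the old file system mounted read-only at, say, /mnt/old and the new
disk at /mnt/new (both mount points are made up for this example),
compare_trees("/mnt/old", "/mnt/new") returns an empty list when every
file matches.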
* Re: Hardware failure or btrfs issue?
From: Hugo Mills @ 2013-07-02 17:48 UTC
To: Peter Chant; +Cc: linux-btrfs
On Tue, Jul 02, 2013 at 06:36:48PM +0100, Peter Chant wrote:
> On 07/02/2013 08:29 AM, Hugo Mills wrote:
> >This is usually an indication that you have bad hardware -- I'd
> >suggest testing RAM, PSU, CPU in that order. I'm not sure what, if
> >anything, can be done to fix the error on the disk right now.
>
> Thanks, appreciated.
>
> Hmm. I've got one stick of RAM out of the machine due to testing as
> I had some freezes last week.
So the damage probably happened then, if that stick is bad.
Filesystems have this irritating habit of remembering things done to
them across reboots. :)
Hugo.
> If it were one of the RAM, PSU and CPU then I'm unsure why this IO
> issue only surfaces on the HDD and not the SSD. I ordered a new HDD
> last night, before reading your post. If it's not the disk I'll go
> raid1. If it is the disk then I'll probably find out.
>
> >>Not that I've done anything other than a cursory check but it looks
> >>like the read only data is fine.
> > Might be a good idea to use that to refresh your backups, just in
> >case my prediction about the fixability is correct.
>
> Well, first option is to drop in the new disk, freshly format it and
> copy the data across (not add it as a second disk). If that fails
> last backup was Wednesday. I've not done much of note since then
> apart from try to fix the disk issues.
>
>
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- The glass is neither half-full nor half-empty; it is twice as ---
large as it needs to be.
* Re: Hardware failure or btrfs issue?
From: Peter Chant @ 2013-07-02 21:37 UTC
To: Hugo Mills, linux-btrfs
On 07/02/2013 06:48 PM, Hugo Mills wrote:
> So the damage probably happened then, if that stick is bad.
> Filesystems have this irritating habit of remembering things done to
> them across reboots. :) Hugo.
The action before the defrag was to delete 48 hours' worth of
hourly snapshots. I was wondering if the numerous snapshots were what
was making defrag so painfully slow. Not that I know anything about
btrfs internals, but I suspect that is a major enough action to expose
any random corruption if there was any. I think I'll restrict snapshots
to once or twice a day at most, unless hourly snapshots really should
cause no issue.
Pete