From: Gaardiolor <gaardiolor@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Corrupted data, failed drive(s)
Date: Thu, 3 Jun 2021 18:50:25 +0200
Message-ID: <cf633d62-73ab-1ce2-f31c-a4a8407a38b4@gmail.com>
Hello,
I could use some help with some issues I'm having with my drives.
I've got 4 disks in raid1.
--
[17:59:07]root@kiwi:/storage/samba/storage# btrfs filesystem df /storage/
Data, RAID1: total=4.39TiB, used=4.38TiB
System, RAID1: total=32.00MiB, used=720.00KiB
Metadata, RAID1: total=6.00GiB, used=4.66GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
[17:59:10]root@kiwi:/storage/samba/storage# btrfs filesystem show
Label: none uuid: 8ce9e167-57ea-4cf8-8678-3049ba028c12
Total devices 4 FS bytes used 4.38TiB
devid 1 size 3.64TiB used 3.10TiB path /dev/sdc
devid 2 size 3.64TiB used 3.14TiB path /dev/sdb
devid 3 size 1.82TiB used 1.32TiB path /dev/sda
devid 4 size 1.82TiB used 1.21TiB path /dev/sdd
--
I'm having some issues with faulty disk(s). /dev/sdd is bad for sure;
SMART is complaining.
--
# smartctl -aq errorsonly /dev/sdd
ATA Error Count: 108 (device log contains only the most recent five errors)
Error 108 occurred at disk power-on lifetime: 47563 hours (1981 days + 19 hours)
Error 107 occurred at disk power-on lifetime: 47563 hours (1981 days + 19 hours)
Error 106 occurred at disk power-on lifetime: 47563 hours (1981 days + 19 hours)
Error 105 occurred at disk power-on lifetime: 47563 hours (1981 days + 19 hours)
Error 104 occurred at disk power-on lifetime: 47563 hours (1981 days + 19 hours)
--
Also in /var/log/messages:
--
Jun 3 17:47:21 kiwi smartd[1112]: Device: /dev/sdd [SAT], 3088 Currently unreadable (pending) sectors
Jun 3 17:47:21 kiwi smartd[1112]: Device: /dev/sdd [SAT], 3088 Offline uncorrectable sectors
--
However, the other disks also generate errors.
--
[18:00:35]root@kiwi:/storage/samba/storage# btrfs device stats /dev/sda
[/dev/sda].write_io_errs 0
[/dev/sda].read_io_errs 0
[/dev/sda].flush_io_errs 0
[/dev/sda].corruption_errs 408
[/dev/sda].generation_errs 0
[18:00:39]root@kiwi:/storage/samba/storage# btrfs device stats /dev/sdb
[/dev/sdb].write_io_errs 0
[/dev/sdb].read_io_errs 0
[/dev/sdb].flush_io_errs 0
[/dev/sdb].corruption_errs 322
[/dev/sdb].generation_errs 0
[18:00:42]root@kiwi:/storage/samba/storage# btrfs device stats /dev/sdc
[/dev/sdc].write_io_errs 0
[/dev/sdc].read_io_errs 0
[/dev/sdc].flush_io_errs 0
[/dev/sdc].corruption_errs 1283
[/dev/sdc].generation_errs 0
[18:00:43]root@kiwi:/storage/samba/storage# btrfs device stats /dev/sdd
[/dev/sdd].write_io_errs 0
[/dev/sdd].read_io_errs 1582
[/dev/sdd].flush_io_errs 0
[/dev/sdd].corruption_errs 1310
[/dev/sdd].generation_errs 0
--
/dev/sdd is the only one with read_io_errs.
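For reading the per-device counters above at a glance, a small filter can keep only the non-zero ones. This is a minimal sketch and not part of the original report: the function name nonzero_stats is hypothetical, and it assumes the two-column "[device].counter value" layout shown in the output above.

```shell
#!/bin/sh
# Read `btrfs device stats` output on stdin and print only the counters
# whose value is not zero.
# Expected input lines look like: [/dev/sdd].read_io_errs 1582
# nonzero_stats is a hypothetical helper name, not a btrfs-progs command.
nonzero_stats() {
    # NF == 2 skips blank or malformed lines; $2 is compared numerically.
    awk 'NF == 2 && $2 != 0 { print $1, $2 }'
}

# Usage (assumes the filesystem is mounted at /storage):
#   btrfs device stats /storage | nonzero_stats
```

Passing the mount point instead of a single device makes btrfs device stats report every member of the filesystem in one run, so the four separate invocations above are not needed.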
I've tried unpacking a .tar.gz from /storage to another filesystem, but
the tar.gz was obviously corrupt. It produced files with very strange
names, which made them pretty difficult to remove. I will not post the
filenames here, it'd probably crash the internet. I'm also getting:
--
gzip: stdin: invalid compressed data--crc error
tar: Child returned status 1
tar: Error is not recoverable: exiting now
--
I can't remove /dev/sdd with btrfs device remove. The command below ran
for a while (I could see the allocated space of /dev/sdd decrease with
btrfs fi us /storage/), but then failed with an error:
--
root@kiwi:~# btrfs device remove /dev/sdd /storage/
ERROR: error removing device '/dev/sdd': Input/output error
--
I have a couple of questions:
1) Unpacking some .tar.gz files from /storage resulted in files with
weird names, and the data was unusable. But it's raid1. Why is my data
corrupt? I've read that BTRFS verifies the checksum on read.
2) Are all 4 of my drives faulty, given the corruption_errs? If so, 4
faulty drives is somewhat unusual. Are there any other possibilities?
3) Given that
- I can't 'btrfs device remove' the device
- I do not have a free SATA port
- I'd prefer a method that doesn't take an unnecessarily long time
what's the best way to migrate to a different device? I'm guessing,
after doing some reading:
- shutdown
- physically remove faulty disk
- boot
- verify /dev/sdd is missing, and that I've removed the correct disk
- shutdown
- connect new disk, it will also be /dev/sdd, because I have no other
free SATA port
- boot
- check that the new disk is /dev/sdd
- mount -o degraded /dev/sda /storage
- btrfs replace start 4 /dev/sdd /storage
- btrfs balance /storage
Is this correct, and will it also check and fix the errors? If not,
what's the best approach? Thanks!
Gaardiolor
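
The software steps from the list in question 3 can be written out as a dry run that only prints the commands, so they can be reviewed before anything touches the disks. This is a sketch of the steps as proposed in this message, not a verified recovery procedure: the function name print_replace_plan and its argument order are hypothetical, and the values in the usage line are the ones from this message.

```shell
#!/bin/sh
# Dry run: print the commands from the proposed migration steps without
# executing any of them. print_replace_plan is a hypothetical helper.
print_replace_plan() {
    devid=$1      # devid of the failed disk (4 in this message)
    newdev=$2     # path of the replacement disk (/dev/sdd in this message)
    mountdev=$3   # any surviving member device (/dev/sda in this message)
    mnt=$4        # mount point (/storage in this message)

    # Mount with the failed device absent.
    echo "mount -o degraded $mountdev $mnt"
    # Rebuild the missing devid onto the new disk.
    echo "btrfs replace start $devid $newdev $mnt"
    # The list writes this step as "btrfs balance /storage";
    # "btrfs balance start" is the spelling used here.
    echo "btrfs balance start $mnt"
}

# Usage:
#   print_replace_plan 4 /dev/sdd /dev/sda /storage
```

Printing rather than running keeps the sketch safe to try on any machine; whether the balance step is needed at all is one of the open questions above.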