From: Gaardiolor <gaardiolor@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Corrupted data, failed drive(s)
Date: Thu, 3 Jun 2021 18:50:25 +0200
Message-ID: <cf633d62-73ab-1ce2-f31c-a4a8407a38b4@gmail.com>

Hello,

I could use some help with a few issues I'm having with my drives.

I've got 4 disks in RAID1:
--
[17:59:07]root@kiwi:/storage/samba/storage# btrfs filesystem df /storage/
Data, RAID1: total=4.39TiB, used=4.38TiB
System, RAID1: total=32.00MiB, used=720.00KiB
Metadata, RAID1: total=6.00GiB, used=4.66GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

[17:59:10]root@kiwi:/storage/samba/storage# btrfs filesystem show
Label: none  uuid: 8ce9e167-57ea-4cf8-8678-3049ba028c12
         Total devices 4 FS bytes used 4.38TiB
         devid    1 size 3.64TiB used 3.10TiB path /dev/sdc
         devid    2 size 3.64TiB used 3.14TiB path /dev/sdb
         devid    3 size 1.82TiB used 1.32TiB path /dev/sda
         devid    4 size 1.82TiB used 1.21TiB path /dev/sdd
--
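
If it helps, I've mostly been watching the per-device allocation with the command below; I'm leaving its output out to keep this short:
--
btrfs filesystem usage /storage
--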

I'm having issues with one or more faulty disks. /dev/sdd is bad for sure; SMART is complaining:
--
# smartctl -aq errorsonly /dev/sdd
ATA Error Count: 108 (device log contains only the most recent five errors)
Error 108 occurred at disk power-on lifetime: 47563 hours (1981 days + 19 hours)
Error 107 occurred at disk power-on lifetime: 47563 hours (1981 days + 19 hours)
Error 106 occurred at disk power-on lifetime: 47563 hours (1981 days + 19 hours)
Error 105 occurred at disk power-on lifetime: 47563 hours (1981 days + 19 hours)
Error 104 occurred at disk power-on lifetime: 47563 hours (1981 days + 19 hours)
--

Also in /var/log/messages:
--
Jun  3 17:47:21 kiwi smartd[1112]: Device: /dev/sdd [SAT], 3088 Currently unreadable (pending) sectors
Jun  3 17:47:21 kiwi smartd[1112]: Device: /dev/sdd [SAT], 3088 Offline uncorrectable sectors
--
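
To double-check the drive itself I'm planning to kick off a long SMART self-test and read the result once it finishes; something like this (assuming the drive survives the test):
--
smartctl -t long /dev/sdd      # start an offline long self-test
smartctl -l selftest /dev/sdd  # check the self-test log when it's done
--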

However, the other disks also generate errors.
--
[18:00:35]root@kiwi:/storage/samba/storage# btrfs device stats /dev/sda
[/dev/sda].write_io_errs    0
[/dev/sda].read_io_errs     0
[/dev/sda].flush_io_errs    0
[/dev/sda].corruption_errs  408
[/dev/sda].generation_errs  0
[18:00:39]root@kiwi:/storage/samba/storage# btrfs device stats /dev/sdb
[/dev/sdb].write_io_errs    0
[/dev/sdb].read_io_errs     0
[/dev/sdb].flush_io_errs    0
[/dev/sdb].corruption_errs  322
[/dev/sdb].generation_errs  0
[18:00:42]root@kiwi:/storage/samba/storage# btrfs device stats /dev/sdc
[/dev/sdc].write_io_errs    0
[/dev/sdc].read_io_errs     0
[/dev/sdc].flush_io_errs    0
[/dev/sdc].corruption_errs  1283
[/dev/sdc].generation_errs  0
[18:00:43]root@kiwi:/storage/samba/storage# btrfs device stats /dev/sdd
[/dev/sdd].write_io_errs    0
[/dev/sdd].read_io_errs     1582
[/dev/sdd].flush_io_errs    0
[/dev/sdd].corruption_errs  1310
[/dev/sdd].generation_errs  0
--

/dev/sdd is the only one with read_io_errs.
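
To see whether these counters are still growing or are mostly historical, I figured I could zero them and re-check after a while; a possible way (assuming /storage stays mounted):
--
btrfs device stats -z /storage   # print and reset the error counters (-z / --reset)
btrfs device stats /storage      # run again later to see if new errors have appeared
--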

I've tried unpacking a .tar.gz from /storage to another filesystem, but 
the archive was obviously corrupt: it extracted files with very strange 
names which, precisely because of those names, were pretty difficult to 
remove. I won't post the filenames here; they'd probably crash the 
internet. I'm also getting:
--
gzip: stdin: invalid compressed data--crc error
tar: Child returned status 1
tar: Error is not recoverable: exiting now
--
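
I guess grepping the kernel log for checksum failures and running a read-only scrub would show which devices and files are affected; something like:
--
dmesg -T | grep -i csum         # checksum failure messages, if any
btrfs scrub start -Bdr /storage # read-only scrub (-r), foreground (-B), per-device stats (-d)
btrfs scrub status -d /storage  # per-device summary afterwards
--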

I can't 'btrfs device remove' /dev/sdd. The command below ran for a 
while (I could see the allocated space on /dev/sdd decrease with 
'btrfs filesystem usage /storage/'), but then it errored out:
--
root@kiwi:~# btrfs device remove /dev/sdd /storage/
  ERROR: error removing device '/dev/sdd': Input/output error
--
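
I guess the kernel log from around that failure has more detail; I can grab it with something like:
--
dmesg -T | grep -i btrfs | tail -n 50   # kernel-side detail on the failed remove
btrfs filesystem usage /storage         # how much data is still allocated on the old disk
--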


I have a couple of questions:

1) Unpacking some .tar.gz files from /storage resulted in files with 
weird names, and the data was unusable. But it's RAID1; why is my data 
corrupt? I've read that btrfs verifies checksums on read.
2) Are all 4 of my drives faulty, given the corruption_errs? If so, 4 
faulty drives at once would be somewhat unusual. Are there other 
possibilities?
3) Given that
- I can't 'btrfs device remove' the device
- I do not have a free SATA port
- I'd prefer a method that doesn't take unnecessarily long

What's the best way to migrate to a different device? I'm guessing, 
after doing some reading (a rough command sketch follows the list):
- shutdown
- physically remove faulty disk
- boot
- verify /dev/sdd is missing, and that I've removed the correct disk
- shutdown
- connect new disk, it will also be /dev/sdd, because I have no other 
free SATA port
- boot
- check that the new disk is /dev/sdd
- mount -o degraded /dev/sda /storage
- btrfs replace start 4 /dev/sdd /storage
- btrfs balance /storage
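
In terms of actual commands, I'm guessing something along these lines (assuming the new disk really does show up as /dev/sdd and the failed one is devid 4):
--
mount -o degraded /dev/sda /storage
btrfs replace start 4 /dev/sdd /storage   # rebuild devid 4 onto the new disk from the remaining copies
btrfs replace status /storage             # watch progress
btrfs balance start /storage              # the final balance from the list above
--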

Is this correct? Should this also check/fix errors? If not, what's the 
best approach? Thanks!

Gaardiolor

Thread overview: 6+ messages
2021-06-03 16:50 Gaardiolor [this message]
2021-06-03 22:37 ` Corrupted data, failed drive(s) Chris Murphy
2021-06-04  9:27   ` Gaardiolor
2021-06-04 23:22     ` Chris Murphy
2021-06-05  9:23       ` Graham Cobb
2021-06-06 16:14       ` Gaardiolor
