linux-btrfs.vger.kernel.org archive mirror
From: Zak Kohler <y2k@y2kbugger.com>
To: "Lakshmipathi.G" <lakshmipathi.g@gmail.com>
Cc: btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs send yields "ERROR: send ioctl failed with -5: Input/output error"
Date: Tue, 24 Oct 2017 21:52:38 -0400
Message-ID: <CAD8FQQ3boTj97yu-F8CiFBvR-EfD+1chcSo=VzyAxwO7GpomEg@mail.gmail.com>
In-Reply-To: <CAD8FQQ20m3sgLKc_utf2xb-J6bhf1W58mAkEFc2jyJy7-bH3+w@mail.gmail.com>

I apologize for the bad line wrapping on the last post...will be
setting up mutt soon.

This is the final result for the offline scrub:
Doing offline scrub [O] [681/683]
Scrub result:
Tree bytes scrubbed: 5234491392
Tree extents scrubbed: 638975
Data bytes scrubbed: 4353723572224
Data extents scrubbed: 374300
Data bytes without csum: 533200896
Read error: 0
Verify error: 0
Csum error: 175
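
The byte counts in the summary above are easier to compare against the
online scrub's GiB figure once converted. A minimal sketch, assuming the
summary keeps the plain "Key: value" layout shown here; the helper names
parse_scrub_result and human are hypothetical and not part of btrfs-progs:

```python
def parse_scrub_result(text):
    """Collect 'Key: <integer>' lines from the scrub summary into a dict."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Lines such as "Scrub result:" have no numeric value and are skipped.
        if sep and value.isdigit():
            result[key.strip()] = int(value)
    return result


def human(n):
    """Format a byte count using binary (1024-based) units."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return f"{n:.2f} {unit}"
        n /= 1024


# Summary text copied from the offline scrub output above.
SUMMARY = """\
Scrub result:
Tree bytes scrubbed: 5234491392
Tree extents scrubbed: 638975
Data bytes scrubbed: 4353723572224
Data extents scrubbed: 374300
Data bytes without csum: 533200896
Read error: 0
Verify error: 0
Csum error: 175
"""

stats = parse_scrub_result(SUMMARY)
for key, value in stats.items():
    # Only the "bytes" counters are sizes; the rest are plain counts.
    shown = human(value) if "bytes" in key else str(value)
    print(f"{key}: {shown}")
```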

The offline scrub apparently corrected some metadata extents while
scanning /dev/sdn


I also ran the online scrub directly on /dev/sdn, which reported "0 errors":

$ btrfs scrub status /dev/sdn
scrub status for 88406942-e3e1-42c6-ad71-e23bb315caa7
        scrub started at Tue Oct 24 06:55:12 2017 and finished after 01:52:44
        total bytes scrubbed: 677.35GiB with 0 errors

The csum mismatches are still missed by the online scrub when choosing
a single <device>. Now I am doing offline scrub on the other devices
to see if they are clean.

$ lsblk -o +SERIAL
NAME      MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT SERIAL
sdh         8:112  0  1.8T  0 disk             WD-WMAZA370XXXX
sdi         8:128  0  1.8T  0 disk             WD-WCAZA569XXXX
sdn         8:208  0  1.8T  0 disk             WD-WCAZA580XXXX

$ btrfs scrub start --offline --progress /dev/sdh
ERROR: data at bytenr 5365456896 ...
ERROR: extent 5341712384 ...
...

One thing to note is that /dev/sdh is also showing csum errors despite
never having been mentioned in dmesg. I understand that it may not be
possible to run two offline checks at once, but the error message I get
is slightly misleading.

$ btrfs scrub start --offline --progress /dev/sdi
ERROR: cannot open device '/dev/sdn': Device or resource busy
ERROR: cannot open file system

I get an error about sdn when the device I am trying to scan is sdi,
and the device that is currently being scanned is sdh.
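
The offline scrub output quoted below is long and mail-wrapped, so it
can help to tally the csum mismatches per corrupted extent. A rough
sketch, assuming the ERROR lines keep the wording shown in this thread
and that each mismatching bytenr falls inside the extent reported for
it; the function names unwrap and group_by_extent are hypothetical:

```python
import re

MISMATCH_RE = re.compile(
    r"ERROR: data at bytenr (\d+) mirror (\d+) csum mismatch, "
    r"have (0x[0-9a-f]+) expect (0x[0-9a-f]+)"
)
EXTENT_RE = re.compile(r"ERROR: extent (\d+) len (\d+) CORRUPTED")


def unwrap(text):
    """Strip '> ' quote prefixes and rejoin mail-wrapped continuation lines."""
    records = []
    for raw in text.splitlines():
        line = re.sub(r"^(?:>\s?)+", "", raw).strip()
        if line.startswith("ERROR:") or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def group_by_extent(text):
    """Map (extent start, length) to offsets of csum mismatches inside it."""
    mismatches, extents = [], []
    for rec in unwrap(text):
        m = MISMATCH_RE.search(rec)
        if m:
            mismatches.append(int(m.group(1)))
            continue
        e = EXTENT_RE.search(rec)
        if e:
            extents.append((int(e.group(1)), int(e.group(2))))
    grouped = {ext: [] for ext in extents}
    for bytenr in mismatches:
        for start, length in extents:
            # Assumption: the bytenr lies within [start, start + length).
            if start <= bytenr < start + length:
                grouped[(start, length)].append(bytenr - start)
    return grouped


# A few lines copied from the quoted output, including the wrapping.
SAMPLE = """\
> ERROR: data at bytenr 68431499264 mirror 1 csum mismatch, have
> 0x5aa0d40f expect 0xd4a15873
> ERROR: extent 68431474688 len 14467072 CORRUPTED, all mirror(s)
> corrupted, can't be repaired
> ERROR: data at bytenr 175643619328 mirror 1 csum mismatch, have
> 0x1e168ca1 expect 0xd413b1e0
> ERROR: data at bytenr 175643754496 mirror 1 csum mismatch, have
> 0x6cfdc8ae expect 0xa6f8f5ef
> ERROR: extent 175640539136 len 6381568 CORRUPTED, all mirror(s)
> corrupted, can't be repaired
"""

grouped = group_by_extent(SAMPLE)
for (start, length), offsets in grouped.items():
    print(f"extent {start} len {length}: {len(offsets)} mismatch(es) at {offsets}")
```

This only reorganizes what the tool already printed; it does not map an
extent back to an inode.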

On Tue, Oct 24, 2017 at 2:00 AM, Zak Kohler <y2k@y2kbugger.com> wrote:
> Yes, it is finding much more than just one error.
>
> From dmesg
> [89520.441354] BTRFS warning (device sdn): csum failed ino 4708 off
> 27529216 csum 2615801759 expected csum 874979996
>
> $ sudo btrfs scrub start --offline --progress /dev/sdn
> ERROR: data at bytenr 68431499264 mirror 1 csum mismatch, have
> 0x5aa0d40f expect 0xd4a15873
> ERROR: extent 68431474688 len 14467072 CORRUPTED, all mirror(s)
> corrupted, can't be repaired
> ERROR: data at bytenr 83646357504 mirror 1 csum mismatch, have
> 0xfc0baabe expect 0x7f9cb681
> ERROR: extent 83519741952 len 134217728 CORRUPTED, all mirror(s)
> corrupted, can't be repaired
> ERROR: data at bytenr 121936633856 mirror 1 csum mismatch, have
> 0x507016a5 expect 0x50609afe
> ERROR: extent 121858334720 len 134217728 CORRUPTED, all mirror(s)
> corrupted, can't be repaired
> ERROR: data at bytenr 144872591360 mirror 1 csum mismatch, have
> 0x33964d73 expect 0xf9937032
> ERROR: extent 144822386688 len 61231104 CORRUPTED, all mirror(s)
> corrupted, can't be repaired
> ERROR: data at bytenr 167961075712 mirror 1 csum mismatch, have
> 0xf43bd0e3 expect 0x5be589bb
> ERROR: extent 167950999552 len 27537408 CORRUPTED, all mirror(s)
> corrupted, can't be repaired
> ERROR: data at bytenr 175643619328 mirror 1 csum mismatch, have
> 0x1e168ca1 expect 0xd413b1e0
> ERROR: data at bytenr 175643754496 mirror 1 csum mismatch, have
> 0x6cfdc8ae expect 0xa6f8f5ef
> ERROR: extent 175640539136 len 6381568 CORRUPTED, all mirror(s)
> corrupted, can't be repaired
> ERROR: data at bytenr 183316750336 mirror 1 csum mismatch, have
> 0x145bdf76 expect 0x7390565e
> .....
> and the list goes on.
>
>
> Questions:
> 1. Using "find /mnt -inum 4708" I can link the dmesg to a specific
> file. Is there a way to link the --offline ERRORs above to an inode?
>
> 2. How could "btrfs device stats /mnt" and a normal full scrub fail
> to detect the csum errors?
>
> 3. Do these errors appear to be hardware failure (despite pristine
> SMART), user error on
> volume creation/mounting, or an actual btrfs issue? I feel that the
> need for question #1
> indicates a problem with btrfs regardless of whether there is a real
> hardware failure or not.
>
>
> Next I will try an online scrub of only the sdn device, as before I
> was running the full filesystem scrub.
>
> On Tue, Oct 24, 2017 at 12:52 AM, Lakshmipathi.G
> <lakshmipathi.g@gmail.com> wrote:
>>> Does anyone know why scrub did not catch these errors that show up in dmesg?
>>
>> Can you try offline scrub from this repo
>> https://github.com/gujx2017/btrfs-progs/tree/offline_scrub and see
>> whether it
>> detects the issue?  "btrfs scrub start --offline <dev>"
>>
>>
>> ----
>> Cheers,
>> Lakshmipathi.G
>> http://www.giis.co.in http://www.webminal.org
