Re: Troubleshooting "Buffer I/O error" on reading md device


From: NeilBrown <neilb@suse.com>
To: RQM <rqm@protonmail.com>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: Re: Troubleshooting "Buffer I/O error" on reading md device
Date: Wed, 03 Jan 2018 08:27:59 +1100
Message-ID: <87r2r8dk80.fsf@notabene.neil.brown.name>
In-Reply-To: <XUhdgUHMsoit8A9Qw13P1q6NQUxsgqNZCsgx6us_8kHu50GPOvQoOBwP-ryQz54CBi6Js7J2xwC8jgKQ5geXi5AdVLT27YlucQ4tWH-8xlM=@protonmail.com>


On Tue, Jan 02 2018, RQM wrote:

> Hello,
>
> thanks for the quick and helpful responses! Answers inline:
>
>> Step one is to confirm that it is easy to reproduce.
>> Does
>>   dd if=/dev/md0 bs=4K skip=1598030208 count=1 of=/dev/null
>>
>> trigger the message reliably?
>> To check that "4K" is the correct blocksize, run
>>   blockdev --getbsz /dev/md0
>>
>> use whatever number it gives as 'bs='.
>
>
> blockdev does indeed report a blocksize of 4096, and the dd line does reliably trigger
> dd: error reading '/dev/md0': Input/output error
> and the same line in dmesg as before.
>
>> Once you can reproduce with minimal IO, do
>> echo file:raid5.c +p > /sys/kernel/debug/dynamic_debug/control
>> repeat experiment
>>
>> echo file:raid5.c -p > /sys/kernel/debug/dynamic_debug/control
>>
>> and report the messages that appear in 'dmesg'.
>
> I had to replace the colon with a space in those two lines (otherwise I would get "bash: echo: write error: Invalid argument"), but after that, this is what I got in dmesg:
> https://paste.ubuntu.com/26305369/
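
For reference, the space-separated form reported above can be wrapped in a
small helper so the debug output is easy to switch on and off around a
single reproduction. This is only a sketch: the function name raid5_debug
and the DDEBUG_CONTROL variable are made up for illustration, and it
assumes debugfs is mounted at /sys/kernel/debug and that it is run as root.

```shell
# Path of the dynamic debug control file (assumes debugfs is mounted here).
DDEBUG_CONTROL=${DDEBUG_CONTROL:-/sys/kernel/debug/dynamic_debug/control}

# raid5_debug +p  enables the pr_debug messages in raid5.c
# raid5_debug -p  disables them again
# (hypothetical helper name, not part of any existing tool)
raid5_debug() {
    case "$1" in
        +p|-p) echo "file raid5.c $1" > "$DDEBUG_CONTROL" ;;
        *) echo "usage: raid5_debug +p|-p" >&2; return 1 ;;
    esac
}

# Typical use, as root:
#   raid5_debug +p
#   dd if=/dev/md0 bs=4K skip=1598030208 count=1 of=/dev/null
#   raid5_debug -p
#   dmesg | tail
```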

[Tue Jan  2 11:14:47 2018] locked=0 uptodate=0 to_read=1 to_write=0 failed=2 failed_num=3,2

So for this stripe, two devices appear to be failed: 3 and 2.
As both devices are clearly thought to be working, there must be a bad
block recorded.

>
>> Also report "mdadm -E" of each member device, and kernel version (though
>> I see that is in the serverfault report :  4.9.30-2+deb9u5).
>
> mdadm -E says: https://paste.ubuntu.com/26305379/

I needed "mdadm -E" of the components of the array, i.e. the partitions
rather than the whole devices, e.g. /dev/sdb1, not /dev/sdb.

This will show a non-empty bad block list on at least two devices.
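
A sketch of how the members could be examined in one go, including the
recorded bad block list itself. The helper name examine_members and the
MDADM variable are made up for illustration, and the partition names in
the usage comment are an assumption (partition 1 on each of sd[b-f]);
substitute whatever /proc/mdstat lists as the members of md0.

```shell
# Command used to talk to md; overridable so the helper can be dry-run.
MDADM=${MDADM:-mdadm}

# Print the superblock and the recorded bad block list of each member.
# (examine_members is a hypothetical helper name)
examine_members() {
    for part in "$@"; do
        echo "== $part =="
        "$MDADM" --examine "$part"
        "$MDADM" --examine-badblocks "$part"
    done
}

# Typical use, as root (partition names are an assumption):
#   examine_members /dev/sd[b-f]1
```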

You can remove the bad block by over-writing it.
  dd if=/dev/zero of=/dev/md0 bs=4K seek=1598030208 count=1
though that might corrupt some file containing the block.

(note "seek" seeks in the output file, "skip" skips over the input
file).
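
The difference is easy to check on scratch files before touching the
array. The sketch below uses made-up temporary files and 4-byte "blocks";
nothing in it touches /dev/md0.

```shell
# Demonstrate skip= (offset in the input) versus seek= (offset in the output).
tmpdir=$(mktemp -d)
printf 'AAAABBBBCCCCDDDD' > "$tmpdir/in"    # four 4-byte blocks
printf '................' > "$tmpdir/out"   # 16 bytes of filler

# skip=2: start reading at input block 2, so this reads "CCCC".
skipped=$(dd if="$tmpdir/in" bs=4 skip=2 count=1 2>/dev/null)

# seek=2: start writing at output block 2. conv=notrunc is needed only
# because the output is a regular file, which dd would otherwise truncate.
dd if="$tmpdir/in" of="$tmpdir/out" bs=4 seek=2 count=1 conv=notrunc 2>/dev/null
seeked=$(cat "$tmpdir/out")

echo "skip=2 read:  $skipped"
echo "seek=2 wrote: $seeked"
rm -rf "$tmpdir"
```

The first line printed shows CCCC (block 2 of the input); the second shows
the filler with AAAA (block 0 of the input) placed at block 2 of the output.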

How did the bad block get there?
A possible scenario is:
 - A device fails and is removed from array
 - read error occurs on another device.  Rather than failing the whole
   device, md records that block as bad.
 - failed device is replaced (or found to be a cabling problem) and
   recovered.  Due to the bad block the stripe cannot be recovered,
   so a bad block is recorded in the new device.

If the read error was really a cabling problem, then the original data
might still be there.  If it is, you could recover it and write it back
to the array rather than writing from /dev/zero.
Finding out which file the failed block is part of is probably possible,
but not necessarily easy.  If you want to try, the first step is
reporting what filesystem is on md0.  If it is ext4, then debugfs can
help.  If something else - I don't know.
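
For the ext4 case, a rough sketch of the debugfs lookup: icheck maps a
filesystem block number to the inode that owns it, and ncheck maps an
inode number to a path. The helper names and the DEBUGFS variable are made
up for illustration. It assumes the filesystem sits directly on /dev/md0
with a 4K block size, so that the dd block number is also the filesystem
block number; with LVM, LUKS or a partition table in between, the number
would need adjusting.

```shell
# Command used to query the filesystem; overridable so this can be dry-run.
DEBUGFS=${DEBUGFS:-debugfs}

# block_to_inode DEVICE BLOCK: which inode owns this filesystem block?
block_to_inode() {
    "$DEBUGFS" -R "icheck $2" "$1"
}

# inode_to_path DEVICE INODE: which path name(s) refer to this inode?
inode_to_path() {
    "$DEBUGFS" -R "ncheck $2" "$1"
}

# Typical use, as root (debugfs opens the device read-only by default):
#   block_to_inode /dev/md0 1598030208
#   inode_to_path  /dev/md0 <inode reported by icheck>
```

If icheck reports no inode for the block, the block is probably free space
or filesystem metadata rather than part of a file.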

NeilBrown

 

> The kernel has been updated between the serverfault post and my first mail to this list to 4.9.65-3+deb9u1. No changes since.
>
>>
>> Then run
>> blktrace /dev/md0 /dev/sd[acdef]
>> in one window while reproducing the error again in another window.
>> Then interrupt the blktrace.  This will produce several blocktrace*
>> files.  create a tar.gz of these and put them somewhere that I can get
>> them - hopefully they won't be too big.
>
> I had to adjust the last blktrace argument to /dev/sd[b-f] since after the last reboot the names of the drives have changed, but here's the output:
> https://filebin.ca/3mnjUz1OIXqm/blktrace-out.tar.gz
> I also included the blktrace terminal output in there.
>
> Thank you so much for the effort! Please let me know if you need anything.

