From: Neil Brown <neilb@suse.de>
To: Michael Tokarev <mjt@tls.msk.ru>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: read errors (in superblock?) aren't fixed by md?
Date: Sat, 13 Nov 2010 06:12:27 +1100
Message-ID: <20101113061227.44da7788@notabene>
In-Reply-To: <4CDD47A7.9010501@msgid.tls.msk.ru>

On Fri, 12 Nov 2010 16:56:55 +0300 Michael Tokarev <mjt@tls.msk.ru> wrote:
> I noticed a few read errors in dmesg, on drives
> which are parts of a raid10 array:
>
> sd 0:0:13:0: [sdf] Unhandled sense code
> sd 0:0:13:0: [sdf] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> sd 0:0:13:0: [sdf] Sense Key : Medium Error [current]
> Info fld=0x880c1d9
> sd 0:0:13:0: [sdf] Add. Sense: Unrecovered read error - recommend rewrite the data
> sd 0:0:13:0: [sdf] CDB: Read(10): 28 00 08 80 c0 bf 00 01 80 00
> end_request: I/O error, dev sdf, sector 142655961
>
> sd 0:0:11:0: [sdd] Unhandled sense code
> sd 0:0:11:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> sd 0:0:11:0: [sdd] Sense Key : Medium Error [current]
> Info fld=0x880c3e5
> sd 0:0:11:0: [sdd] Add. Sense: Unrecovered read error - recommend rewrite the data
> sd 0:0:11:0: [sdd] CDB: Read(10): 28 00 08 80 c2 3f 00 02 00 00
> end_request: I/O error, dev sdd, sector 142656485
>
> Both sdf and sdd are parts of the same (raid10) array,
> and this array is the only usage for these drives (i.e.,
> there's nothing else reading them). Both the mentioned
> locations are near the end of the only partition on
> these drives:
>
> # partition table of /dev/sdf
> unit: sectors
> /dev/sdf1 : start= 63, size=142657137, Id=83
>
> (the same partition table is on /dev/sdd too).
>
> Sector 142657200 is the start of the next (non-existing)
> partition, so the last sector of the first partition is
> 142657199.
>
> Now, we've read errors on sectors 142655961 (sdf)
> and 142656485 (sdd), which are 1239 and 715 sectors
> before the end of the partition, respectively.
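
For anyone following along, the offsets above can be re-derived with plain
shell arithmetic. This is only a sketch: the variable names are mine, and the
numbers are copied from the quoted partition table and the "Info fld" values
in the sense data.

```shell
# Numbers copied from the quoted sfdisk and dmesg output.
part_start=63
part_size=142657137
part_end=$(( part_start + part_size ))   # first sector past sdf1

# "Info fld" values from the sense data; they match the end_request sectors.
err_sdf=$(( 0x880c1d9 ))
err_sdd=$(( 0x880c3e5 ))

# Distance from each failing sector to the first sector past the partition.
dist_sdf=$(( part_end - err_sdf ))
dist_sdd=$(( part_end - err_sdd ))

echo "partition end (exclusive): $part_end"
echo "sdf: sector $err_sdf, $dist_sdf sectors before the end"
echo "sdd: sector $err_sdd, $dist_sdd sectors before the end"
```

The distances are measured from sector 142657200, the first sector after the
partition, which is how the 1239 and 715 figures come out.
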
>
> The array is this:
>
> # mdadm -E /dev/sdf1
> /dev/sdf1:
> Magic : a92b4efc
> Version : 00.90.00
> UUID : 1c49b395:293761c8:4113d295:43412a46
> Creation Time : Sun Jun 27 04:37:12 2010
> Raid Level : raid10
> Used Dev Size : 71328256 (68.02 GiB 73.04 GB)
> Array Size : 499297792 (476.17 GiB 511.28 GB)
> Raid Devices : 14
> Total Devices : 14
> Preferred Minor : 11
>
> Update Time : Fri Nov 12 16:55:06 2010
> State : clean
> Internal Bitmap : present
> Active Devices : 14
> Working Devices : 14
> Failed Devices : 0
> Spare Devices : 0
> Checksum : 104a3529 - correct
> Events : 16790
>
> Layout : near=2, far=1
> Chunk Size : 256K
>
> Number Major Minor RaidDevice State
> this 10 8 81 10 active sync /dev/sdf1
> 0 0 8 1 0 active sync /dev/sda1
> 1 1 8 113 1 active sync /dev/sdh1
> 2 2 8 17 2 active sync /dev/sdb1
> 3 3 8 129 3 active sync /dev/sdi1
> 4 4 8 33 4 active sync /dev/sdc1
> 5 5 8 145 5 active sync /dev/sdj1
> 6 6 8 49 6 active sync /dev/sdd1
> 7 7 8 161 7 active sync /dev/sdk1
> 8 8 8 65 8 active sync /dev/sde1
> 9 9 8 177 9 active sync /dev/sdl1
> 10 10 8 81 10 active sync /dev/sdf1
> 11 11 8 193 11 active sync /dev/sdm1
> 12 12 8 97 12 active sync /dev/sdg1
> 13 13 8 209 13 active sync /dev/sdn1
>
>
> What's wrong with these read errors? I just verified -
> the error persists, i.e. reading the mentioned sectors
> using dd produces the same errors again, so there were
> no re-writes there.
>
> Can md handle this situation gracefully?

These sectors would be in the internal bitmap, which starts at 142657095
and ends before 142657215.

The bitmap is read from just one device when the array is assembled, then
written to all devices whenever it is modified.

I'm not sure off-hand exactly how md would handle read errors in the
bitmap. I would expect it to just disable the bitmap, but it doesn't appear
to be doing that... odd. I would need to investigate further.

You should be able to get md to overwrite the area by removing the internal
bitmap and adding it back (--grow --bitmap=none, then
--grow --bitmap=internal).

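Spelled out, that would look something like the sketch below. The device
name /dev/md11 is only a guess from "Preferred Minor : 11" in your -E output,
and the RUN variable and the function name are just for illustration;
--grow and --bitmap= are the only mdadm options involved.

```shell
# ASSUMPTION: the array node is /dev/md11 (guessed from "Preferred Minor : 11").
MD=${MD:-/dev/md11}

# Set RUN=echo to print the commands instead of running them.
# Left empty, the commands are executed for real and need root.
RUN=${RUN:-}

# Drop the internal bitmap, then re-create it so that md writes the
# bitmap area afresh on every member device.
rewrite_bitmap() {
    $RUN mdadm --grow "$MD" --bitmap=none &&
    $RUN mdadm --grow "$MD" --bitmap=internal
}
```

Setting RUN=echo before calling rewrite_bitmap shows what would be run.
After a real run, re-reading the failing sectors with dd as you did before
should tell you whether they were rewritten.
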
NeilBrown
>
> Thanks!
>
> /mjt
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html