Re: read errors (in superblock?) aren't fixed by md?

From: Neil Brown <neilb@suse.de>
To: Michael Tokarev <mjt@tls.msk.ru>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: read errors (in superblock?) aren't fixed by md?
Date: Sat, 13 Nov 2010 06:12:27 +1100
Message-ID: <20101113061227.44da7788@notabene>
In-Reply-To: <4CDD47A7.9010501@msgid.tls.msk.ru>

On Fri, 12 Nov 2010 16:56:55 +0300
Michael Tokarev <mjt@tls.msk.ru> wrote:

> I noticed a few read errors in dmesg, on drives
> which are parts of a raid10 array:
> 
> sd 0:0:13:0: [sdf] Unhandled sense code
> sd 0:0:13:0: [sdf] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> sd 0:0:13:0: [sdf] Sense Key : Medium Error [current]
> Info fld=0x880c1d9
> sd 0:0:13:0: [sdf] Add. Sense: Unrecovered read error - recommend rewrite the data
> sd 0:0:13:0: [sdf] CDB: Read(10): 28 00 08 80 c0 bf 00 01 80 00
> end_request: I/O error, dev sdf, sector 142655961
> 
> sd 0:0:11:0: [sdd] Unhandled sense code
> sd 0:0:11:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> sd 0:0:11:0: [sdd] Sense Key : Medium Error [current]
> Info fld=0x880c3e5
> sd 0:0:11:0: [sdd] Add. Sense: Unrecovered read error - recommend rewrite the data
> sd 0:0:11:0: [sdd] CDB: Read(10): 28 00 08 80 c2 3f 00 02 00 00
> end_request: I/O error, dev sdd, sector 142656485
> 
> Both sdf and sdd are parts of the same (raid10) array,
> and this array is the only usage for these drives (i.e.,
> there's nothing else reading them).  Both the mentioned
> locations are near the end of the only partition on
> these drives:
> 
> # partition table of /dev/sdf
> unit: sectors
> /dev/sdf1 : start=       63, size=142657137, Id=83
> 
> (the same partition table is on /dev/sdd too).
> 
> Sector 142657200 is the start of the next (non-existing)
> partition, so the last sector of the first partition is
> 142657199.
> 
> Now, we have read errors on sectors 142655961 (sdf)
> and 142656485 (sdd), which are 1239 and 715 sectors
> before the end of the partition, respectively.
> 
> The array is this:
> 
> # mdadm -E /dev/sdf1
> /dev/sdf1:
>           Magic : a92b4efc
>         Version : 00.90.00
>            UUID : 1c49b395:293761c8:4113d295:43412a46
>   Creation Time : Sun Jun 27 04:37:12 2010
>      Raid Level : raid10
>   Used Dev Size : 71328256 (68.02 GiB 73.04 GB)
>      Array Size : 499297792 (476.17 GiB 511.28 GB)
>    Raid Devices : 14
>   Total Devices : 14
> Preferred Minor : 11
> 
>     Update Time : Fri Nov 12 16:55:06 2010
>           State : clean
> Internal Bitmap : present
>  Active Devices : 14
> Working Devices : 14
>  Failed Devices : 0
>   Spare Devices : 0
>        Checksum : 104a3529 - correct
>          Events : 16790
> 
>          Layout : near=2, far=1
>      Chunk Size : 256K
> 
>       Number   Major   Minor   RaidDevice State
> this    10       8       81       10      active sync   /dev/sdf1
>    0     0       8        1        0      active sync   /dev/sda1
>    1     1       8      113        1      active sync   /dev/sdh1
>    2     2       8       17        2      active sync   /dev/sdb1
>    3     3       8      129        3      active sync   /dev/sdi1
>    4     4       8       33        4      active sync   /dev/sdc1
>    5     5       8      145        5      active sync   /dev/sdj1
>    6     6       8       49        6      active sync   /dev/sdd1
>    7     7       8      161        7      active sync   /dev/sdk1
>    8     8       8       65        8      active sync   /dev/sde1
>    9     9       8      177        9      active sync   /dev/sdl1
>   10    10       8       81       10      active sync   /dev/sdf1
>   11    11       8      193       11      active sync   /dev/sdm1
>   12    12       8       97       12      active sync   /dev/sdg1
>   13    13       8      209       13      active sync   /dev/sdn1
> 
> 
> What's wrong with these read errors?  I just verified -
> the error persists, i.e. reading the mentioned sectors
> using dd produces the same errors again, so there were
> no re-writes there.
> 
> Can md handle this situation gracefully?
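
The sector arithmetic in the quoted report can be checked mechanically. The sketch below is only an illustration: it assumes the standard SCSI Read(10) CDB layout (opcode 0x28, 4-byte big-endian LBA, 2-byte transfer length) and takes the partition start and size from the quoted partition table. The names `decode_read10`, `summarize` and `REPORTS` are made up for this example and are not part of any tool mentioned in the thread.

```python
import struct

# Values copied from the quoted partition table (units: sectors).
PART_START = 63
PART_SIZE = 142657137
PART_END = PART_START + PART_SIZE  # first sector past the partition

# Per drive: (CDB bytes, "Info fld" value, sector from end_request line).
REPORTS = {
    "sdf": ("28 00 08 80 c0 bf 00 01 80 00", 0x880C1D9, 142655961),
    "sdd": ("28 00 08 80 c2 3f 00 02 00 00", 0x880C3E5, 142656485),
}


def decode_read10(cdb_hex):
    """Return (lba, length) from a 10-byte Read(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("Read(10) CDB must be exactly 10 bytes")
    # opcode, flags, LBA (32-bit BE), group, transfer length (16-bit BE), control
    opcode, _flags, lba, _group, length, _control = struct.unpack(">BBIBHB", cdb)
    if opcode != 0x28:
        raise ValueError("not a Read(10) command")
    return lba, length


def summarize(cdb_hex, info_field, failed_sector):
    """Relate one reported error to its request and to the partition."""
    lba, length = decode_read10(cdb_hex)
    return {
        "request": (lba, lba + length),  # half-open sector range
        "info_matches": info_field == failed_sector,
        "inside_request": lba <= failed_sector < lba + length,
        "offset_in_partition": failed_sector - PART_START,
        "sectors_before_end": PART_END - failed_sector,
    }


if __name__ == "__main__":
    for dev, report in REPORTS.items():
        print(dev, summarize(*report))
```

With the values quoted above, `sectors_before_end` comes out as 1239 for sdf and 715 for sdd, and in both cases the "Info fld" value equals the sector in the `end_request` line.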

These sectors would be in the internal bitmap which starts at 142657095
and ends before 142657215.

The bitmap is read from just one device when the array is assembled, then
written to all devices when it is modified.

I'm not sure off-hand exactly how md handles read errors in the bitmap.  I
would expect it to just disable the bitmap, but it doesn't appear to be doing
that... odd.  I would need to investigate more.

You should be able to get md to overwrite the area by removing the internal
bitmap and adding it back (with --grow --bitmap=none, then
--grow --bitmap=internal).
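
As a minimal sketch of that remove-and-re-add sequence, the helper below builds the two `mdadm --grow` invocations and, by default, only prints them. The array path `/dev/md11` is an assumption inferred from "Preferred Minor : 11" in the quoted `mdadm -E` output, and the function names are hypothetical. Actually running the commands requires root, and the array has no write-intent bitmap between the two steps.

```python
import subprocess


def bitmap_rewrite_commands(array):
    """Return the two mdadm calls: drop the internal bitmap, then re-add it."""
    return [
        ["mdadm", "--grow", array, "--bitmap=none"],
        ["mdadm", "--grow", array, "--bitmap=internal"],
    ]


def rewrite_bitmap(array, dry_run=True):
    """Print the commands (dry run) or execute them in order, stopping on failure."""
    commands = bitmap_rewrite_commands(array)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if mdadm exits non-zero,
            # so the bitmap is not re-added after a failed removal.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # /dev/md11 is an assumed device name; adjust to the real array.
    rewrite_bitmap("/dev/md11", dry_run=True)
```

Keeping the dry run as the default makes it easy to review the exact commands before touching a live array.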

NeilBrown


> 
> Thanks!
> 
> /mjt

Thread overview: 3+ messages
2010-11-12 13:56 read errors (in superblock?) aren't fixed by md? Michael Tokarev
2010-11-12 19:12 ` Neil Brown [this message]
2010-11-16  8:58   ` Michael Tokarev