* read errors (in superblock?) aren't fixed by md?

From: Michael Tokarev <mjt@tls.msk.ru>
Date: 2010-11-12 13:56 UTC
To: linux-raid

I noticed a few read errors in dmesg, on drives
which are parts of a raid10 array:

sd 0:0:13:0: [sdf] Unhandled sense code
sd 0:0:13:0: [sdf] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 0:0:13:0: [sdf] Sense Key : Medium Error [current]
Info fld=0x880c1d9
sd 0:0:13:0: [sdf] Add. Sense: Unrecovered read error - recommend rewrite the data
sd 0:0:13:0: [sdf] CDB: Read(10): 28 00 08 80 c0 bf 00 01 80 00
end_request: I/O error, dev sdf, sector 142655961

sd 0:0:11:0: [sdd] Unhandled sense code
sd 0:0:11:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 0:0:11:0: [sdd] Sense Key : Medium Error [current]
Info fld=0x880c3e5
sd 0:0:11:0: [sdd] Add. Sense: Unrecovered read error - recommend rewrite the data
sd 0:0:11:0: [sdd] CDB: Read(10): 28 00 08 80 c2 3f 00 02 00 00
end_request: I/O error, dev sdd, sector 142656485

Both sdf and sdd are parts of the same (raid10) array,
and this array is the only user of these drives (i.e.,
nothing else reads them). Both of the mentioned
locations are near the end of the only partition on
these drives:

# partition table of /dev/sdf
unit: sectors
/dev/sdf1 : start= 63, size=142657137, Id=83

(the same partition table is on /dev/sdd too).

Sector 142657200 is the start of the next (non-existing)
partition, so the last sector of the first partition is
142657199.

Now, we have read errors on sectors 142655961 (sdf)
and 142656485 (sdd), which are 1239 and 715 sectors
before the end of the partition, respectively.

The array is this:

# mdadm -E /dev/sdf1
/dev/sdf1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 1c49b395:293761c8:4113d295:43412a46
  Creation Time : Sun Jun 27 04:37:12 2010
     Raid Level : raid10
  Used Dev Size : 71328256 (68.02 GiB 73.04 GB)
     Array Size : 499297792 (476.17 GiB 511.28 GB)
   Raid Devices : 14
  Total Devices : 14
Preferred Minor : 11

    Update Time : Fri Nov 12 16:55:06 2010
          State : clean
Internal Bitmap : present
 Active Devices : 14
Working Devices : 14
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 104a3529 - correct
         Events : 16790

         Layout : near=2, far=1
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice   State
this    10       8       81       10        active sync   /dev/sdf1

   0     0       8        1        0        active sync   /dev/sda1
   1     1       8      113        1        active sync   /dev/sdh1
   2     2       8       17        2        active sync   /dev/sdb1
   3     3       8      129        3        active sync   /dev/sdi1
   4     4       8       33        4        active sync   /dev/sdc1
   5     5       8      145        5        active sync   /dev/sdj1
   6     6       8       49        6        active sync   /dev/sdd1
   7     7       8      161        7        active sync   /dev/sdk1
   8     8       8       65        8        active sync   /dev/sde1
   9     9       8      177        9        active sync   /dev/sdl1
  10    10       8       81       10        active sync   /dev/sdf1
  11    11       8      193       11        active sync   /dev/sdm1
  12    12       8       97       12        active sync   /dev/sdg1
  13    13       8      209       13        active sync   /dev/sdn1

What's wrong with these read errors? I just verified -
the errors persist, i.e. reading the mentioned sectors
using dd produces the same errors again, so there were
no rewrites there.

Can md handle this situation gracefully?

Thanks!

/mjt
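For reference, the per-sector dd check mentioned above can be
reproduced like this (a sketch: the sector numbers are the absolute
disk sectors from the kernel log, bs=512 assumes 512-byte logical
sectors, and iflag=direct bypasses the page cache so the read really
hits the disk):

  # dd if=/dev/sdf of=/dev/null bs=512 skip=142655961 count=1 iflag=direct
  # dd if=/dev/sdd of=/dev/null bs=512 skip=142656485 count=1 iflag=direct

A persistent Medium Error from either command confirms the drive
still cannot read that sector.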
* Re: read errors (in superblock?) aren't fixed by md?

From: Neil Brown
Date: 2010-11-12 19:12 UTC
To: Michael Tokarev; +Cc: linux-raid

On Fri, 12 Nov 2010 16:56:55 +0300
Michael Tokarev <mjt@tls.msk.ru> wrote:

> I noticed a few read errors in dmesg, on drives
> which are parts of a raid10 array:
>
> sd 0:0:13:0: [sdf] Unhandled sense code
> sd 0:0:13:0: [sdf] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> sd 0:0:13:0: [sdf] Sense Key : Medium Error [current]
> Info fld=0x880c1d9
> sd 0:0:13:0: [sdf] Add. Sense: Unrecovered read error - recommend rewrite the data
> sd 0:0:13:0: [sdf] CDB: Read(10): 28 00 08 80 c0 bf 00 01 80 00
> end_request: I/O error, dev sdf, sector 142655961
>
> sd 0:0:11:0: [sdd] Unhandled sense code
> sd 0:0:11:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> sd 0:0:11:0: [sdd] Sense Key : Medium Error [current]
> Info fld=0x880c3e5
> sd 0:0:11:0: [sdd] Add. Sense: Unrecovered read error - recommend rewrite the data
> sd 0:0:11:0: [sdd] CDB: Read(10): 28 00 08 80 c2 3f 00 02 00 00
> end_request: I/O error, dev sdd, sector 142656485
>
> Both sdf and sdd are parts of the same (raid10) array,
> and this array is the only user of these drives (i.e.,
> nothing else reads them). Both of the mentioned
> locations are near the end of the only partition on
> these drives:
>
> # partition table of /dev/sdf
> unit: sectors
> /dev/sdf1 : start= 63, size=142657137, Id=83
>
> (the same partition table is on /dev/sdd too).
>
> Sector 142657200 is the start of the next (non-existing)
> partition, so the last sector of the first partition is
> 142657199.
>
> Now, we have read errors on sectors 142655961 (sdf)
> and 142656485 (sdd), which are 1239 and 715 sectors
> before the end of the partition, respectively.
>
> The array is this:
>
> # mdadm -E /dev/sdf1
> /dev/sdf1:
>           Magic : a92b4efc
>         Version : 00.90.00
>            UUID : 1c49b395:293761c8:4113d295:43412a46
>   Creation Time : Sun Jun 27 04:37:12 2010
>      Raid Level : raid10
>   Used Dev Size : 71328256 (68.02 GiB 73.04 GB)
>      Array Size : 499297792 (476.17 GiB 511.28 GB)
>    Raid Devices : 14
>   Total Devices : 14
> Preferred Minor : 11
>
>     Update Time : Fri Nov 12 16:55:06 2010
>           State : clean
> Internal Bitmap : present
>  Active Devices : 14
> Working Devices : 14
>  Failed Devices : 0
>   Spare Devices : 0
>        Checksum : 104a3529 - correct
>          Events : 16790
>
>          Layout : near=2, far=1
>      Chunk Size : 256K
>
>       Number   Major   Minor   RaidDevice   State
> this    10       8       81       10        active sync   /dev/sdf1
>
>    0     0       8        1        0        active sync   /dev/sda1
>    1     1       8      113        1        active sync   /dev/sdh1
>    2     2       8       17        2        active sync   /dev/sdb1
>    3     3       8      129        3        active sync   /dev/sdi1
>    4     4       8       33        4        active sync   /dev/sdc1
>    5     5       8      145        5        active sync   /dev/sdj1
>    6     6       8       49        6        active sync   /dev/sdd1
>    7     7       8      161        7        active sync   /dev/sdk1
>    8     8       8       65        8        active sync   /dev/sde1
>    9     9       8      177        9        active sync   /dev/sdl1
>   10    10       8       81       10        active sync   /dev/sdf1
>   11    11       8      193       11        active sync   /dev/sdm1
>   12    12       8       97       12        active sync   /dev/sdg1
>   13    13       8      209       13        active sync   /dev/sdn1
>
> What's wrong with these read errors? I just verified -
> the errors persist, i.e. reading the mentioned sectors
> using dd produces the same errors again, so there were
> no rewrites there.
>
> Can md handle this situation gracefully?

These sectors would be in the internal bitmap, which starts at 142657095
and ends before 142657215.
The bitmap is read from just one device when the array is assembled,
then written to all devices when it is modified.

I'm not sure off-hand exactly how md would handle read errors. I would
expect it to just disable the bitmap, but it doesn't appear to be doing
that... odd. I would need to investigate more.

You should be able to get md to over-write the area by removing the
internal bitmap and adding it back (with --grow --bitmap=none /
--grow --bitmap=internal).

NeilBrown
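For concreteness, the bitmap round-trip Neil suggests would look like
the following (a sketch: the md device name is an assumption based on
the "Preferred Minor : 11" field above - check /proc/mdstat for the
real name before running anything):

  # mdadm --grow --bitmap=none /dev/md11        (drop the internal bitmap)
  # mdadm --grow --bitmap=internal /dev/md11    (re-create it, rewriting the bitmap area)

The second command also accepts the usual bitmap options such as
--bitmap-chunk, if the original chunk size should be preserved.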
* Re: read errors (in superblock?) aren't fixed by md?

From: Michael Tokarev <mjt@tls.msk.ru>
Date: 2010-11-16 8:58 UTC
To: Neil Brown; +Cc: linux-raid

12.11.2010 22:12, Neil Brown wrote:
> On Fri, 12 Nov 2010 16:56:55 +0300
> Michael Tokarev <mjt@tls.msk.ru> wrote:
>
>> end_request: I/O error, dev sdf, sector 142655961
>> end_request: I/O error, dev sdd, sector 142656485
>>
>> Both sdf and sdd are parts of the same (raid10) array,
>>
>> # partition table of /dev/sdf
>> unit: sectors
>> /dev/sdf1 : start= 63, size=142657137, Id=83
>>
>> Now, we have read errors on sectors 142655961 (sdf)
>> and 142656485 (sdd), which are 1239 and 715 sectors
>> before the end of the partition, respectively.
>>
>>           Magic : a92b4efc
>>         Version : 00.90.00
>>   Used Dev Size : 71328256 (68.02 GiB 73.04 GB)
>>      Array Size : 499297792 (476.17 GiB 511.28 GB)
>> Internal Bitmap : present
>>  Active Devices : 14
>>          Layout : near=2, far=1
>>      Chunk Size : 256K
>>
>> What's wrong with these read errors? I just verified -
>> the errors persist, i.e. reading the mentioned sectors
>> using dd produces the same errors again, so there were
>> no rewrites there.
>>
>> Can md handle this situation gracefully?
>
> These sectors would be in the internal bitmap, which starts at 142657095
> and ends before 142657215.
>
> The bitmap is read from just one device when the array is assembled,
> then written to all devices when it is modified.

In this case there should have been no reason to read these areas in
the first place. The read errors happened during regular operation;
the machine had an uptime of about 30 days and the array had been in
use since boot. A few days earlier, a verify pass had completed
successfully.

> I'm not sure off-hand exactly how md would handle read errors. I would
> expect it to just disable the bitmap, but it doesn't appear to be doing
> that... odd. I would need to investigate more.

Again, it depends on why it tried to _read_ these areas to start with.

> You should be able to get md to over-write the area by removing the
> internal bitmap and adding it back (with --grow --bitmap=none /
> --grow --bitmap=internal).

I tried this - no, it appears md[adm] does not write there. Neither
of the two disks was fixed by it.

I tried to re-write the sectors manually using dd, but that is very
error-prone, so I rewrote only 2 sectors, very carefully (it appears
there are more bad blocks in these areas; the sector with the next
number is also unreadable) - and the fix held: the drive simply
remapped them and increased its Reallocated Sector Count (from 0 to
2 - for a 72GB drive this is nothing).

Since this is an important production array, I went ahead and
reconfigured it completely - first I changed the partitions to end
before the problem area (and to start later too, just in case - I
moved the beginning from sector 63 to 1MiB), and created a bunch of
raid1 arrays instead of a single raid10 (the array held an Oracle
database with multiple files, so it was easy to distribute them
across multiple filesystems). I created bitmaps again, now in a
different location; let's see how it all works out...

But a few questions still remain:

1) What is located in these areas? If it is the bitmap, md should
rewrite them during bitmap creation. Maybe the bitmap was smaller
(I used --bitmap-chunk=4096 iirc)? Again, if it was, what was in
these places, and who/what tried to read them?
2) How can mdadm be forced to correct these areas, without the risk
of over-writing something so that the array won't work?

3) (probably not related to md, but) It is interesting that several
disks developed bad sectors in the same area at once. Of the 14
drives, I noticed 5 problematic ones - 2 with real bad blocks and 3
more with long delays while reading these areas (this is what
prompted me to reconfigure the array). They're even from different
vendors. In theory, modern hard drives should not suffer even from
repeated writes to the same area (as might happen for write-intensive
bitmap areas - though per (1) above it isn't clear what is actually
there). I have no explanation here.

4) (related but different) Is there a way to force md to re-write or
check a particular place on one of the components? While trying to
fix the unreadable sector I hit /dev/sdf somewhere in the middle, and
had to remove it from the array, remove the bitmap, and add it back,
just to be sure md would write the right data into that sector...

Thanks!

/mjt
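For the record, the careful single-sector rewrite described above has
roughly this shape (a sketch using a sector number from the kernel
log; writing directly to a RAID member destroys whatever md keeps at
that offset, so this is only safe for areas known to hold no needed
data, and only after the read check confirms the sector is bad):

  # dd if=/dev/zero of=/dev/sdf bs=512 seek=142655961 count=1 conv=notrunc oflag=direct

On question 4: a scrub of the data area can be requested through
sysfs (again assuming the array is md11); it reads every data sector
and should rewrite unreadable ones from a good copy, but it does not
cover the superblock or bitmap areas:

  # echo check > /sys/block/md11/md/sync_action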