Howto avoid full re-sync

linux-raid.vger.kernel.org archive mirror

* Howto avoid full re-sync
From: Adam Goryachev @ 2012-09-07  4:41 UTC
  To: Linux RAID

I have an MD raid6 with 5 drives, and every now and then one (random)
drive will fail. I've done all sorts of checks, and the drive is
actually working fine, so I suspect an issue with the Linux driver
and/or SATA controller (onboard).

It isn't really relevant to the question, but I'll run through the SATA
stuff, in case anyone can point out a simple solution to stop this from
happening (yes, a new server is on the way, but with budgets etc, that
could be some time away. This issue has happened for years, but we are
becoming more active with these failures now).

00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller (rev a1)
00:0f.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller (rev a1)
01:07.0 RAID bus controller: Silicon Image, Inc. Adaptec AAR-1210SA SATA HostRAID Controller (rev 02)

cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid6 sdh1[5] sdg1[4] sdf1[0] sdd1[6](F) sde1[2] sda1[1]
      5860535808 blocks level 6, 64k chunk, algorithm 2 [5/4] [UUU_U]
      [>....................]  recovery =  1.4% (28663240/1953511936) finish=486.5min speed=65938K/sec

(As you can see, sdd failed, but the kernel found it again as sdh, so
I've re-added it).

/dev/md2:
        Version : 0.90
  Creation Time : Fri Aug 11 21:45:20 2006
     Raid Level : raid6
     Array Size : 5860535808 (5589.04 GiB 6001.19 GB)
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 5
  Total Devices : 6
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Fri Sep  7 14:31:10 2012
          State : clean, degraded, recovering
 Active Devices : 4
Working Devices : 5
 Failed Devices : 1
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 1% complete

           UUID : e6cfbc82:c23e52da:9cb07c6d:11629c30
         Events : 0.7762116

    Number   Major   Minor   RaidDevice State
       0       8       81        0      active sync   /dev/sdf1
       1       8        1        1      active sync   /dev/sda1
       2       8       65        2      active sync   /dev/sde1
       5       8      113        3      spare rebuilding   /dev/sdh1
       4       8       97        4      active sync   /dev/sdg1

       6       8       49        -      faulty spare

Since I know sdh is actually almost up to date, is there some way to
re-add it, and only have to sync the portions of the disk which have
changed?

Thanks,
Adam

-- 
Adam Goryachev
Website Managers
Ph: +61 2 8304 0000                            adam@websitemanagers.com.au
Fax: +61 2 8304 0001                            www.websitemanagers.com.au


* Re: Howto avoid full re-sync
From: Ralf Müller @ 2012-09-07  9:41 UTC
  To: Adam Goryachev; +Cc: Linux RAID


On 07.09.2012 at 06:41, Adam Goryachev wrote:

> Since I know sdh is actually almost up to date, is there some way to
> re-add it, and only have to sync the portions of the disk which have
> changed?


Besides all the stuff about fix your server, a raid is not a backup and you risk your data - simply add a write intent bitmap:

# mdadm /dev/md2 --grow bitmap=internal

Best regards
Ralf


* Re: Howto avoid full re-sync
From: Phil Turmel @ 2012-09-07 12:41 UTC
  To: Ralf Müller; +Cc: Adam Goryachev, Linux RAID

Hi Adam,

On 09/07/2012 05:41 AM, Ralf Müller wrote:
> 
> On 07.09.2012 at 06:41, Adam Goryachev wrote:
> 
>> I have a MD raid6 with 5 drives, and every now and then one (random)
>> drive will fail. I've done all sorts of checks, and the drive is
>> actually working fine, so I suspect an issue with the Linux driver
>> and/or SATA controller (onboard).

In years on this list, most cases of "drive fails out of raid, but
checks out OK" have been a side effect of the mismatch between default
Linux controller timeouts (in the drivers) and error recovery timeouts
in non-enterprise drives.

Really.  Search for "scterc" in the list archives.

A few solutions:
1)  Buy enterprise drives that have short timeouts by default.
2)  Buy desktop drives that support SCTERC, and use scripts to set it
every time they are plugged in or booted up.
3)  Change the driver timeouts.
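
Options 2 and 3 can be combined in one boot-time script. What follows is only a sketch under several assumptions: `smartctl` (from smartmontools) is installed and exits non-zero when a drive rejects the SCTERC command; the 7.0-second recovery limit and the 180-second driver timeout are commonly used values rather than anything stated in this thread; and `SYSFS_ROOT` is a hypothetical override added purely so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch only: align drive error recovery with the kernel's command timeout.
# Assumptions (not from this thread): smartctl exits non-zero when SCTERC
# cannot be set; 70 deciseconds and 180 seconds are conventional values.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.

set_drive_timeout() {
    dev="$1"    # bare device name, e.g. sda
    if smartctl -l scterc,70,70 "/dev/$dev" >/dev/null 2>&1; then
        # The drive accepted a 7.0 s limit for both reads and writes.
        echo "$dev: SCTERC set to 7.0 seconds"
    else
        # No SCTERC: let the driver wait longer than the drive's own retries.
        echo 180 > "${SYSFS_ROOT:-/sys}/block/$dev/device/timeout"
        echo "$dev: SCTERC unavailable, driver timeout raised to 180 seconds"
    fi
}

# Usage (as root, at every boot or hot-plug):
#   for d in sda sde sdf sdg sdh; do set_drive_timeout "$d"; done
```

Neither setting survives a power cycle, which is why it has to run every time the drives are plugged in or booted up.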

>> It isn't really relevant to the question, but I'll run through the sata
>> stuff, in case anyone can point out a simple solution to stop this from
>> happening (yes, a new server is on the way, but with budgets etc, that
>> could be some time away. This issue has happened for years, but we are
>> becoming more active with these failures now).

If the new server has enterprise drives, it'll all just work.  And
you'll have some physical reliability advantages, too.  My needs haven't
justified the extra expense, though, so I do #2.  At the moment, only
Hitachi is supporting SCTERC in desktop models.

>> Since I know sdh is actually almost up to date, is there some way to
>> re-add it, and only have to sync the portions of the disk which have
>> changed?
> 
> 
> Besides all the stuff about fix your server, a raid is not a backup and you risk your data - simply add a write intent bitmap:
> 
> # mdadm /dev/md2 --grow bitmap=internal

Definitely add the bitmap.  But that's a band-aid.  If you have a
timeout mismatch, the odds of total failure of your raid6 array are very
high, even with perfectly good disks.  Most desktop drives quote one
unrecoverable read error per 1e14 bits read.  That's only 12.5TB: about
four complete passes through a 3TB drive or, taken together, one pass
through four 3TB drives.  Hmmm.  Precisely what happens when rebuilding a
five-drive raid5 or raid6.
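
The arithmetic behind that estimate can be checked with a few lines of shell. This is a rough sketch, assuming read errors are independent and occur at exactly the quoted rate, so that the chance of at least one error is 1 - exp(-expected errors). The function names are illustrative only.

```shell
#!/bin/sh
# Worked arithmetic for the unrecoverable-read-error (URE) estimate.
# Assumption: errors are independent and occur at exactly the quoted rate.

# Terabytes (10^12 bytes) read per expected URE, for 1 error per 10^EXP bits.
tb_per_ure() {
    awk -v e="$1" 'BEGIN { printf "%.1f\n", 10^e / 8 / 1e12 }'
}

# Expected UREs, and chance of at least one, when reading N drives of T TB.
ure_risk() {
    awk -v n="$1" -v t="$2" -v e="$3" 'BEGIN {
        expected = n * t * 1e12 * 8 / 10^e
        printf "expected=%.2f p_at_least_one=%.2f\n", expected, 1 - exp(-expected)
    }'
}

tb_per_ure 14       # one URE per 1e14 bits
ure_risk 4 3 14     # one pass over four 3TB drives
```

Run as is, this prints `12.5` and then `expected=0.96 p_at_least_one=0.62`, i.e. roughly one expected read error per full pass over four 3TB drives under this simple model.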

HTH,

Phil

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Howto avoid full re-sync
From: Adam Goryachev @ 2012-09-09 23:11 UTC
  To: Linux RAID

On 09/07/2012 07:41 PM, Ralf Müller wrote:
> Besides all the stuff about fix your server, a raid is not a backup and you risk your data - simply add a write intent bitmap:
>
> # mdadm /dev/md2 --grow bitmap=internal
>
mdadm /dev/md2 --grow bitmap=internal
mdadm: can only add devices to linear arrays

md2 is raid6:
md2 : active raid6 sdh1[3] sdg1[4] sdf1[0] sde1[2] sda1[1]
       5860535808 blocks level 6, 64k chunk, algorithm 2 [5/5] [UUUUU]

Regards,
Adam


* Re: Howto avoid full re-sync
From: NeilBrown @ 2012-09-10  1:02 UTC
  To: Adam Goryachev; +Cc: Linux RAID

On Mon, 10 Sep 2012 09:11:54 +1000 Adam Goryachev
<adam@websitemanagers.com.au> wrote:

> On 09/07/2012 07:41 PM, Ralf Müller wrote:
> > Besides all the stuff about fix your server, a raid is not a backup and you risk your data - simply add a write intent bitmap:
> >
> > # mdadm /dev/md2 --grow bitmap=internal
> >
> mdadm /dev/md2 --grow bitmap=internal
> mdadm: can only add devices to linear arrays

Check the man page....


 mdadm /dev/md2 --grow --bitmap=internal
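
To confirm the bitmap is actually in place before relying on it, something like the following could be used. It is a minimal sketch, assuming the kernel reports an active bitmap as a `bitmap:` line inside the array's stanza in /proc/mdstat. The function name `has_bitmap` and its optional file argument are illustrative and not part of mdadm.

```shell
#!/bin/sh
# has_bitmap ARRAY [MDSTAT_FILE]
# Succeeds if ARRAY's stanza in the mdstat file contains a "bitmap:" line.
has_bitmap() {
    awk -v name="$1" '
        $2 == ":"                   { in_array = ($1 == name); next }  # stanza header
        NF == 0                     { in_array = 0 }                   # blank line ends it
        in_array && $1 == "bitmap:" { found = 1 }
        END                         { exit found ? 0 : 1 }
    ' "${2:-/proc/mdstat}"
}

# Usage (as root):
#   has_bitmap md2 || mdadm /dev/md2 --grow --bitmap=internal
#   # After a drive drops out and reappears, --re-add lets md use the bitmap
#   # to resync only the regions written in the meantime:
#   mdadm /dev/md2 --re-add /dev/sdh1
```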

NeilBrown


> 
> md2 is raid6:
> md2 : active raid6 sdh1[3] sdg1[4] sdf1[0] sde1[2] sda1[1]
>        5860535808 blocks level 6, 64k chunk, algorithm 2 [5/5] [UUUUU]
> 
> Regards,
> Adam



* Re: Howto avoid full re-sync
From: Ralf Müller @ 2012-09-10 14:12 UTC
  To: Adam Goryachev; +Cc: Linux RAID


On 10.09.2012 at 01:11, Adam Goryachev wrote:
>> 
>> Besides all the stuff about fix your server, a raid is not a backup and you risk your data - simply add a write intent bitmap:
>> 
>> # mdadm /dev/md2 --grow bitmap=internal
>> 
> mdadm /dev/md2 --grow bitmap=internal
> mdadm: can only add devices to linear arrays
> 

My fault: 

mdadm /dev/md2 --grow --bitmap=internal

Ralf




* Re: Howto avoid full re-sync
From: Adam Goryachev @ 2012-09-12 12:52 UTC
  To: Ralf Müller; +Cc: Linux RAID

On 11/09/12 00:12, Ralf Müller wrote:
> Besides all the stuff about fix your server, a raid is not a backup and you risk your data - simply add a write intent bitmap:
>
> mdadm /dev/md2 --grow --bitmap=internal
I've added this across a number of my systems now, and it seems to work
really well (Thank You), especially on one system which has a RAID1 with an
external USB drive + internal drive, which normally takes over 2 days for
a full resync.

Is there any disadvantage to using the bitmap (maybe write performance
suffers, or disk space is reduced, or ....)?

While I can see the benefits, I'm just wondering if it might be too good
to be true, or what I am missing....

Thanks again for your assistance.

Regards,
Adam


* Re: Howto avoid full re-sync
From: John Robinson @ 2012-09-12 13:28 UTC
  To: Adam Goryachev; +Cc: Ralf Müller, Linux RAID

On 12/09/2012 13:52, Adam Goryachev wrote:
> On 11/09/12 00:12, Ralf Müller wrote:
>> Besides all the stuff about fix your server, a raid is not a backup and you risk your data - simply add a write intent bitmap:
>>
>> mdadm /dev/md2 --grow --bitmap=internal
> I've added this across a number of my systems now, and it seems to work
> really well (Thank You), especially on one system which has a RAID1 with an
> external USB drive + internal drive, which normally takes over 2 days for
> a full resync.

It may also be worth reading up on --write-mostly for your external USB 
drive, particularly if your workload tends to be single long streaming 
reads rather than lots of small parallel ones.
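
For reference, here is a sketch of marking an existing RAID1 member write-mostly through the md sysfs interface, so that reads are served from the other member whenever possible. The array and member names (`md0`, `sdb1`) are placeholders, since the thread does not name them, and `SYSFS_ROOT` is a hypothetical override for trying the function on a scratch directory.

```shell
#!/bin/sh
# set_write_mostly ARRAY MEMBER
# Writes "writemostly" to the member's md state file; fails if it is missing.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
set_write_mostly() {
    array="$1"
    member="$2"
    state="${SYSFS_ROOT:-/sys}/block/$array/md/dev-$member/state"
    if [ ! -w "$state" ]; then
        echo "no such member: $array/$member" >&2
        return 1
    fi
    echo writemostly > "$state"
}

# Usage (as root, placeholder names):
#   set_write_mostly md0 sdb1
# When adding a new member instead, mdadm applies the flag to devices listed
# after the option:
#   mdadm /dev/md0 --add --write-mostly /dev/sdb1
```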

> Is there any disadvantage to using the bitmap (maybe write performance
> suffers, or disk space is reduced, or ....)?
>
> While I can see the benefits, I'm just wondering if it might be too good
> to be true, or what I am missing....
>
> Thanks again for your assistance.

Yes, write performance suffers, especially if you have a small bitmap
chunk size, and the default is as small as possible for the array. Run
`mdadm -X /dev/sdX` on one of the components of your array to see what the
default was calculated to be, and read the --bitmap-chunk
section of `man mdadm` for a description of bitmap chunk size
considerations. I found a bitmap chunk of 128MB (131072KB) kept write
performance very near no-bitmap speeds (both MB/s and IOPS) but kept
resync times fast (a few seconds).

I'm not sure, but you may have to remove the bitmap (--grow 
--bitmap=none) before re-adding one with a different chunk size (e.g. 
--grow --bitmap=internal --bitmap-chunk=131072).
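
To get a feel for what a given chunk size means, the number of regions the bitmap has to track can be estimated as below. This is a sketch that assumes one bitmap bit per chunk of each component device, using the Used Dev Size that `mdadm --detail` reported for md2 earlier in the thread. `bitmap_bits` is an illustrative helper, not an mdadm feature.

```shell
#!/bin/sh
# bitmap_bits DEV_SIZE_KB CHUNK_KB
# Number of bitmap chunks needed to cover one component, rounding up.
bitmap_bits() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

bitmap_bits 1953511936 131072    # Used Dev Size of md2, 128MB bitmap chunk

# Usage (as root), following the suggestion above:
#   mdadm /dev/md2 --grow --bitmap=none
#   mdadm /dev/md2 --grow --bitmap=internal --bitmap-chunk=131072
#   mdadm -X /dev/sda1    # examine the resulting bitmap on one component
```

For md2 this prints `14905`. Each of those regions is 128MB, so one stray write marks 128MB for resync. That is the trade-off: fewer, larger chunks mean fewer bitmap updates but more to resync per dirty bit.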

Cheers,

John.


Thread overview:
2012-09-07  4:41 Howto avoid full re-sync Adam Goryachev
2012-09-07  9:41 ` Ralf Müller
2012-09-07 12:41   ` Phil Turmel
2012-09-09 23:11   ` Adam Goryachev
2012-09-10  1:02     ` NeilBrown
2012-09-10 14:12     ` Ralf Müller
2012-09-12 12:52       ` Adam Goryachev
2012-09-12 13:28         ` John Robinson
