RAID mirror, resyncing from bad disk

linux-raid.vger.kernel.org archive mirror

* RAID mirror, resyncing from bad disk
From: Ronny Adsetts @ 2009-09-13 19:19 UTC (permalink / raw)
  To: Linux RAID ML


Hi,

I've found myself in a situation that I'm unable to resolve hence this request for help.

A messed-up /etc/mdadm/mdadm.conf led to one of the disks in a RAID 1 mirrored pair being kicked from the array. Unfortunately, the disk left in the array has bad sectors, so I can't resync the good disk back in and then fail and replace the bad one.

/proc/mdstat looks like this:

Personalities : [raid1]
md0 : active raid1 sdb1[0] sdc1[1]
      489856 blocks [2/2] [UU]

md1 : active raid1 sdb2[0] sdc2[2](F)
      1951808 blocks [2/1] [U_]

md2 : active raid1 sdb3[2](F) sdc3[1]
      114776320 blocks [2/1] [_U]

unused devices: <none>

The problematic array is /dev/md2 and the dying disk is /dev/sdc.
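
For anyone reading the mdstat output above: a member marked (F) has been failed out of its array. The small sketch below (a hypothetical helper, not something from the original post) pulls those members out of mdstat-style text. Note that in md2 the (F) flag is on sdb3, the healthy disk that was kicked, while sdc3 on the dying disk is the only active member.

```shell
#!/bin/sh
# Hypothetical helper: list "array member" pairs for every member
# flagged (F) in mdstat-style text read from standard input.
failed_members() {
    awk '/^md[0-9]+ :/ {
        # Fields 5 and up are the members, e.g. "sdc2[2](F)".
        for (i = 5; i <= NF; i++)
            if ($i ~ /\(F\)$/) {
                sub(/\[.*/, "", $i)   # strip "[2](F)" to leave the device name
                print $1, $i
            }
    }'
}

# Two lines copied from the mdstat output in this message.
failed_members <<'EOF'
md1 : active raid1 sdb2[0] sdc2[2](F)
md2 : active raid1 sdb3[2](F) sdc3[1]
EOF
```

On a live system the same function could be fed with `failed_members < /proc/mdstat`.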

When I try to resync, it gets to about 99.2%, then gives loads of I/O errors in /var/log/kern.log and finally gives up and restarts the sync.

Ideally I just want to tell the system to ignore the bad sector and just resync the array.

Does anyone have any ideas on how I can get this resolved short of reinstalling? This is a production server so I'd like to avoid the downtime if at all possible. (I have all the important stuff backed up on tape; /dev/md2 is mainly system stuff).

Other misc. info:

$ mdadm --version
mdadm - v2.6.2 - 21st May 2007

$ uname -a
Linux vimes 2.6.24-etchnhalf.1-amd64 #1 SMP Sat Aug 15 20:38:41 UTC 2009 x86_64 GNU/Linux

The md2 array is on SATA disks. It is used as an LVM physical volume, which is divided into about 5 logical volumes, all using the XFS filesystem.

Logs for a resync:
---
Aug 19 12:20:54 vimes mdadm: Rebuild80 event detected on md device /dev/md2
Aug 19 12:38:26 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:38:26 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:38:26 vimes kernel: ata2.00: cmd 25/00:00:b0:e2:f1/00:04:0d:00:00/e0 tag 0 dma 524288 in
Aug 19 12:38:26 vimes kernel:          res 51/40:55:05:e4:f1/40:01:0d:00:00/e0 Emask 0x9 (media error)
Aug 19 12:38:26 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:38:26 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:38:26 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:38:26 vimes kernel: ata2: EH complete
Aug 19 12:38:30 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:38:30 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:38:30 vimes kernel: ata2.00: cmd 25/00:00:b0:e2:f1/00:04:0d:00:00/e0 tag 0 dma 524288 in
Aug 19 12:38:30 vimes kernel:          res 51/40:55:05:e4:f1/40:01:0d:00:00/e0 Emask 0x9 (media error)
Aug 19 12:38:30 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:38:30 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:38:30 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:38:30 vimes kernel: ata2: EH complete
Aug 19 12:38:33 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:38:33 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:38:33 vimes kernel: ata2.00: cmd 25/00:00:b0:e2:f1/00:04:0d:00:00/e0 tag 0 dma 524288 in
Aug 19 12:38:33 vimes kernel:          res 51/40:55:05:e4:f1/40:01:0d:00:00/e0 Emask 0x9 (media error)
Aug 19 12:38:33 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:38:33 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:38:33 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:38:33 vimes kernel: ata2: EH complete
Aug 19 12:38:37 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:38:37 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:38:37 vimes kernel: ata2.00: cmd 25/00:00:b0:e2:f1/00:04:0d:00:00/e0 tag 0 dma 524288 in
Aug 19 12:38:37 vimes kernel:          res 51/40:55:05:e4:f1/40:01:0d:00:00/e0 Emask 0x9 (media error)
Aug 19 12:38:37 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:38:37 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:38:37 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:38:37 vimes kernel: ata2: EH complete
Aug 19 12:38:41 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:38:41 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:38:41 vimes kernel: ata2.00: cmd 25/00:00:b0:e2:f1/00:04:0d:00:00/e0 tag 0 dma 524288 in
Aug 19 12:38:41 vimes kernel:          res 51/40:55:05:e4:f1/40:01:0d:00:00/e0 Emask 0x9 (media error)
Aug 19 12:38:41 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:38:41 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:38:41 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:38:41 vimes kernel: ata2: EH complete
Aug 19 12:38:45 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:38:45 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:38:45 vimes kernel: ata2.00: cmd 25/00:00:b0:e2:f1/00:04:0d:00:00/e0 tag 0 dma 524288 in
Aug 19 12:38:45 vimes kernel:          res 51/40:55:05:e4:f1/40:01:0d:00:00/e0 Emask 0x9 (media error)
Aug 19 12:38:45 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:38:45 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:38:45 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:38:45 vimes kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
Aug 19 12:38:45 vimes kernel: sd 1:0:0:0: [sdb] Sense Key : Medium Error [current] [descriptor]
Aug 19 12:38:45 vimes kernel: Descriptor sense data with sense descriptors (in hex):
Aug 19 12:38:45 vimes kernel:         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Aug 19 12:38:45 vimes kernel:         0d f1 e4 05
Aug 19 12:38:45 vimes kernel: sd 1:0:0:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
Aug 19 12:38:45 vimes kernel: end_request: I/O error, dev sdb, sector 233956357
Aug 19 12:38:45 vimes kernel: ata2: EH complete
Aug 19 12:38:45 vimes kernel: sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB)
Aug 19 12:38:45 vimes kernel: sd 1:0:0:0: [sdb] Write Protect is off
Aug 19 12:38:45 vimes kernel: sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Aug 19 12:38:45 vimes kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 19 12:38:45 vimes kernel: sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB)
Aug 19 12:38:45 vimes kernel: sd 1:0:0:0: [sdb] Write Protect is off
Aug 19 12:38:45 vimes kernel: sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Aug 19 12:38:45 vimes kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 19 12:38:49 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:38:49 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:38:49 vimes kernel: ata2.00: cmd c8/00:08:00:e4:f1/00:00:00:00:00/ed tag 0 dma 4096 in
Aug 19 12:38:49 vimes kernel:          res 51/40:05:05:e4:f1/40:01:0d:00:00/ed Emask 0x9 (media error)
Aug 19 12:38:49 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:38:49 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:38:49 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:38:49 vimes kernel: ata2: EH complete
Aug 19 12:38:52 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:38:52 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:38:52 vimes kernel: ata2.00: cmd c8/00:08:00:e4:f1/00:00:00:00:00/ed tag 0 dma 4096 in
Aug 19 12:38:52 vimes kernel:          res 51/40:05:05:e4:f1/40:01:0d:00:00/ed Emask 0x9 (media error)
Aug 19 12:38:52 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:38:52 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:38:53 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:38:53 vimes kernel: ata2: EH complete
Aug 19 12:38:56 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:38:56 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:38:56 vimes kernel: ata2.00: cmd c8/00:08:00:e4:f1/00:00:00:00:00/ed tag 0 dma 4096 in
Aug 19 12:38:56 vimes kernel:          res 51/40:05:05:e4:f1/40:01:0d:00:00/ed Emask 0x9 (media error)
Aug 19 12:38:56 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:38:56 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:38:56 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:38:56 vimes kernel: ata2: EH complete
Aug 19 12:39:00 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:39:00 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:39:00 vimes kernel: ata2.00: cmd c8/00:08:00:e4:f1/00:00:00:00:00/ed tag 0 dma 4096 in
Aug 19 12:39:00 vimes kernel:          res 51/40:05:05:e4:f1/40:01:0d:00:00/ed Emask 0x9 (media error)
Aug 19 12:39:00 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:39:00 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:39:00 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:39:00 vimes kernel: ata2: EH complete
Aug 19 12:39:04 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:39:04 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:39:04 vimes kernel: ata2.00: cmd c8/00:08:00:e4:f1/00:00:00:00:00/ed tag 0 dma 4096 in
Aug 19 12:39:04 vimes kernel:          res 51/40:05:05:e4:f1/40:01:0d:00:00/ed Emask 0x9 (media error)
Aug 19 12:39:04 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:39:04 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:39:04 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:39:04 vimes kernel: ata2: EH complete
Aug 19 12:39:08 vimes kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 19 12:39:08 vimes kernel: ata2.00: BMDMA stat 0x24
Aug 19 12:39:08 vimes kernel: ata2.00: cmd c8/00:08:00:e4:f1/00:00:00:00:00/ed tag 0 dma 4096 in
Aug 19 12:39:08 vimes kernel:          res 51/40:05:05:e4:f1/40:01:0d:00:00/ed Emask 0x9 (media error)
Aug 19 12:39:08 vimes kernel: ata2.00: status: { DRDY ERR }
Aug 19 12:39:08 vimes kernel: ata2.00: error: { UNC }
Aug 19 12:39:08 vimes kernel: ata2.00: configured for UDMA/133
Aug 19 12:39:08 vimes kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
Aug 19 12:39:08 vimes kernel: sd 1:0:0:0: [sdb] Sense Key : Medium Error [current] [descriptor]
Aug 19 12:39:08 vimes kernel: Descriptor sense data with sense descriptors (in hex):
Aug 19 12:39:08 vimes kernel:         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Aug 19 12:39:08 vimes kernel:         0d f1 e4 05
Aug 19 12:39:08 vimes kernel: sd 1:0:0:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
Aug 19 12:39:08 vimes kernel: end_request: I/O error, dev sdb, sector 233956357
Aug 19 12:39:08 vimes kernel: raid1: sdb: unrecoverable I/O read error for block 229072512
Aug 19 12:39:08 vimes kernel: ata2: EH complete
Aug 19 12:39:08 vimes kernel: sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB)
Aug 19 12:39:08 vimes kernel: sd 1:0:0:0: [sdb] Write Protect is off
Aug 19 12:39:08 vimes kernel: sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Aug 19 12:39:08 vimes kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 19 12:39:08 vimes kernel: sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB)
Aug 19 12:39:08 vimes kernel: sd 1:0:0:0: [sdb] Write Protect is off
Aug 19 12:39:08 vimes kernel: sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Aug 19 12:39:08 vimes kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 19 12:39:08 vimes kernel: RAID1 conf printout:
Aug 19 12:39:08 vimes kernel:  --- wd:1 rd:2
Aug 19 12:39:08 vimes kernel:  disk 0, wo:1, o:1, dev:sda3
Aug 19 12:39:08 vimes kernel:  disk 1, wo:0, o:1, dev:sdb3
Aug 19 12:39:08 vimes kernel: RAID1 conf printout:
Aug 19 12:39:08 vimes kernel:  --- wd:1 rd:2
Aug 19 12:39:08 vimes kernel:  disk 1, wo:0, o:1, dev:sdb3
Aug 19 12:39:08 vimes kernel: RAID1 conf printout:
Aug 19 12:39:08 vimes kernel:  --- wd:1 rd:2
Aug 19 12:39:08 vimes kernel:  disk 0, wo:1, o:1, dev:sda3
Aug 19 12:39:08 vimes kernel:  disk 1, wo:0, o:1, dev:sdb3
Aug 19 12:39:08 vimes kernel: md: recovery of RAID array md2
Aug 19 12:39:08 vimes kernel: md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
Aug 19 12:39:08 vimes mdadm: RebuildFinished event detected on md device /dev/md2
Aug 19 12:39:08 vimes kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Aug 19 12:39:08 vimes kernel: md: using 128k window, over a total of 114776320 blocks.
Aug 19 12:39:08 vimes mdadm: RebuildStarted event detected on md device /dev/md2
Aug 19 12:51:08 vimes mdadm: Rebuild20 event detected on md device /dev/md2
---

Thanks for any help.

Ronny
-- 
Ronny Adsetts
Technical Director
Amazing Internet Ltd, London
t: +44 20 8607 9535
f: +44 20 8607 9536
w: www.amazinginternet.com

Registered office: UK House, 82 Heath Road, Twickenham TW1 4BW
Registered in England. Company No. 4042957 




* Re: RAID mirror, resyncing from bad disk
From: John Robinson @ 2009-09-13 20:24 UTC (permalink / raw)
  To: Ronny Adsetts; +Cc: Linux RAID ML

On 13/09/2009 20:19, Ronny Adsetts wrote:
[...]
> The problematic array is /dev/md2 and the dying disk is /dev/sdc.
> 
> When I try to resync it gets to about 99.2% then gives load of I/O errors in /var/log/kern.log and finally gives up and restarts the sync.
> 
> Ideally I just want to tell the system to ignore the bad sector and just resync the array.
> 
> Does anyone have any ideas on how I can get this resolved short of reinstalling?

You need to ddrescue the contents of the dying drive onto a good one, 
then replace the dying drive with the fresh one. This will require 
taking the array down but you won't have to reinstall, unless you've 
lost something important in the lost sectors on the dying drive.
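
A rough sketch of what such a clone might look like with GNU ddrescue follows. This is an illustration under assumptions, not a procedure given in the thread: /dev/sdX (dying disk), /dev/sdY (new disk) and the map file path are placeholders, and the helper only prints the commands so nothing is written by accident. It assumes the array is not in use while the copy runs, for example from a rescue environment.

```shell
#!/bin/sh
# Dry-run sketch: prints the ddrescue commands instead of running them.
# The device names and map file passed in are placeholders.
rescue_plan() {
    src=$1 dst=$2 map=$3
    # Pass 1: copy everything that reads cleanly; -n skips the slow
    # scraping of bad areas, -f allows writing to a block device.
    echo "ddrescue -f -n $src $dst $map"
    # Pass 2: revisit the bad areas with direct disc access (-d) and
    # up to three retry passes (-r3), resuming from the same map file.
    echo "ddrescue -f -d -r3 $src $dst $map"
}

rescue_plan /dev/sdX /dev/sdY /root/rescue.map
```

Keeping the map file between runs is what lets the second pass pick up only the areas the first pass could not read. Double-check source and destination before running anything for real, since swapping them overwrites the disk being rescued.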

Cheers,

John.


* Re: RAID mirror, resyncing from bad disk
From: Majed B. @ 2009-09-13 22:49 UTC (permalink / raw)
  To: John Robinson; +Cc: Ronny Adsetts, Linux RAID ML

If your data is backed up, doing everything from scratch is your best
bet, and I'll tell you why you shouldn't try to patch up your current
situation:

1) Have you written a single byte to the array(s) since the first disk
was thrown out of the array?
If yes, then adding it again will be equal to adding a new hard disk.
If no, then why was it trying to resync? I think it's already flagged
dirty but you've got 99.2% worth of data...

2) Your disk that is currently running the RAID1 array has bad sectors.
If you try to clone that, you'll fail to clone all sectors, which
means loss of data.

3) Ignoring bad sectors means you'll end up with corrupt Physical
Volumes under the LVM (Since your arrays are Physical Volumes of the
LVM). This will also cause problems with the filesystem.

I would suggest buying TWO new disks, reinstalling everything, and then
restoring your backup(s) and data. It's better than spending numerous
hours trying to find out why a service isn't working, only to realize
days later that some binaries or config files are corrupt.

The reason I say TWO new disks, is that you already have one disk with
bad sectors, and the other popped out of the array for a reason. Do
not add the 2nd disk until you're truly sure there's nothing wrong
with it (run smartd and execute short and long tests, try to read the
whole disk a few times using dd: dd if=/dev/sdc of=/dev/null bs=10M).

If you see link errors in the kernel/dmesg, change your SATA cables.
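
The checks described above might be sketched as follows. This is an assumption-laden illustration rather than part of the original message: it uses smartctl from smartmontools to start the self-tests, /dev/sdX is a placeholder for the disk under suspicion, and the helper only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch: prints the checks rather than running them.
# The disk name passed in is a placeholder.
disk_checks() {
    disk=$1
    echo "smartctl -t short $disk"       # start the short self-test
    echo "smartctl -t long $disk"        # start the long self-test once the short one has finished
    echo "smartctl -l selftest $disk"    # read the self-test log afterwards
    echo "dd if=$disk of=/dev/null bs=10M"   # read the whole disk end to end
}

disk_checks /dev/sdX
```

The self-tests run in the background on the drive, so the log is only meaningful once each test has completed. The dd read is non-destructive because its output goes to /dev/null.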

>> The problematic array is /dev/md2 and the dying disk is /dev/sdc.
>>
>> When I try to resync it gets to about 99.2% then gives load of I/O errors
>> in /var/log/kern.log and finally gives up and restarts the sync.
>>
>> Ideally I just want to tell the system to ignore the bad sector and just
>> resync the array.
>>
>> Does anyone have any ideas on how I can get this resolved short of
>> reinstalling?
-- 
       Majed B.


* Re: RAID mirror, resyncing from bad disk
From: Ronny Adsetts @ 2009-09-14  9:00 UTC (permalink / raw)
  To: Majed B.; +Cc: Linux RAID ML


Majed B. said at 13/09/2009 23:49:
> If your data is backed up, doing everything from scratch is your best
> bet, and I'll tell you why you shouldn't try to patch up your current
> situation:

I know this is the best way forward, however finding time to do this is harder. Obviously, if push comes to shove, I'll have to do it. Just trying to avoid it.

> 1) Have you written a single byte to the array(s) when the first disk
> was thrown out of the array?
> If yes, then adding it again will be equal to adding a new hard disk.
> If no, then why was it trying to resync? I think it's already flagged
> dirty but you've got 99.2% worth of data...

The good disk was kicked because of a configuration error. There have been writes since then (on the /var partition) hence the required resync.

> 2) Your disk that is currently running the RAID1 array has bad sectors.
> If you try to clone that, you'll fail to clone all sectors, which
> means loss of data.

Understood.

> 3) Ignoring bad sectors means you'll end up with corrupt Physical
> Volumes under the LVM (Since your arrays are Physical Volumes of the
> LVM). This will also cause problems with the filesystem.

The bad sector is in an unallocated part of the PV - there are no LVs on it. Thankfully the bad sector is right at the end of the disk.
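
One way to double-check a claim like this is to turn the failing sector into a physical extent number and compare it with the PV's allocation map. The arithmetic below is only a sketch under assumptions: it treats the 229072512 from the "raid1: sdb: unrecoverable I/O read error for block" log line as a 512-byte sector offset into md2, and it assumes a PV data start of 384 sectors and 4 MiB extents (8192 sectors). Neither of those two values is stated in the thread, so the real ones would need to be read from the LVM tools.

```shell
#!/bin/sh
# pe_index SECTOR PE_START PE_SIZE
# All three arguments are in 512-byte sectors. PE_START and PE_SIZE
# below are assumed values, not figures taken from this system.
pe_index() {
    echo $(( ($1 - $2) / $3 ))
}

pe_index 229072512 384 8192   # prints 27962 under these assumptions
```

The resulting extent number can then be looked up in the output of `pvdisplay --maps` to see whether that extent range is listed as free.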

> I would suggest buying TWO new disks and reinstall everything and
> restore your backup(s) then data. It's better than spending numerous
> hours trying to find out why a service isn't working, then after days
> you realize that some binaries are corrupt or the config files.

I can checksum all the binaries and compare to a known good system and the config files are all backed up. I've rebooted recently and it all works now with the bad sector present, so I'm not concerned about this since it seems unlikely to happen.

> The reason I say TWO new disks, is that you already have one disk with
> bad sectors, and the other popped out of the array for a reason. Do
> not add the 2nd disk until you're truly sure there's nothing wrong
> with it (run smartd and execute short and long tests, try to read the
> whole disk a few times using dd: dd if=/dev/sdc of=/dev/null bs=10M).
> 
> If you see link errors in the kernel/dmesg, change your SATA cables.

I've run SMART tests on the good disk and it appears fine. I'll do a dd of the whole disk to verify before I reuse it.

Ronny

* Re: RAID mirror, resyncing from bad disk
From: Ronny Adsetts @ 2009-09-14  9:13 UTC (permalink / raw)
  To: John Robinson; +Cc: Linux RAID ML


John Robinson said at 13/09/2009 21:24:
> On 13/09/2009 20:19, Ronny Adsetts wrote:
> [...]
>> The problematic array is /dev/md2 and the dying disk is /dev/sdc.
>>
>> When I try to resync it gets to about 99.2% then gives load of I/O
>> errors in /var/log/kern.log and finally gives up and restarts the sync.
>>
>> Ideally I just want to tell the system to ignore the bad sector and
>> just resync the array.
>>
>> Does anyone have any ideas on how I can get this resolved short of
>> reinstalling?
> 
> You need to ddrescue the contents of the dying drive onto a good one,
> then replace the dying drive with the fresh one. This will require
> taking the array down but you won't have to reinstall, unless you've
> lost something important in the lost sectors on the dying drive.

The bad sector is thankfully right at the end of the disk in an unallocated part of the LVM setup.

So, do I ddrescue the entire disk? I assume this will copy everything including the partition table and other gubbins?

Thanks very much, this looks like a winner to me. :-).

Ronny

* Re: RAID mirror, resyncing from bad disk
From: John Robinson @ 2009-09-14  9:23 UTC (permalink / raw)
  To: Ronny Adsetts; +Cc: Linux RAID ML

On 14/09/2009 10:13, Ronny Adsetts wrote:
> John Robinson said at 13/09/2009 21:24:
>> On 13/09/2009 20:19, Ronny Adsetts wrote:
>> [...]
>>> The problematic array is /dev/md2 and the dying disk is /dev/sdc.
>>>
>>> When I try to resync it gets to about 99.2% then gives load of I/O
>>> errors in /var/log/kern.log and finally gives up and restarts the sync.
>>>
>>> Ideally I just want to tell the system to ignore the bad sector and
>>> just resync the array.
>>>
>>> Does anyone have any ideas on how I can get this resolved short of
>>> reinstalling?
>> You need to ddrescue the contents of the dying drive onto a good one,
>> then replace the dying drive with the fresh one. This will require
>> taking the array down but you won't have to reinstall, unless you've
>> lost something important in the lost sectors on the dying drive.
> 
> The bad sector is thankfully right at the end of the disk in an unallocated part of the LVM setup.
> 
> So, do I ddrescue the entire disk? I assume this will copy everything including the partition table and other gubbins?

Yes you do, and yes it will. You don't have to use ddrescue, there are 
other tools - for examples see 
http://www.forensicswiki.org/wiki/Ddrescue - but ddrescue would be my 
choice.

> Thanks very much, this looks like a winner to me. :-).

Good luck with it!

Cheers,

John.

