No resync

* No resync
@ 2010-01-12  4:44 Richard Scobie
  2010-01-12  4:52 ` Leslie Rhorer
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Scobie @ 2010-01-12  4:44 UTC
  To: linux-raid

Is there a way to force-assemble a stopped array that has too many
failed devices, while disabling the resync that would normally occur?

Regards,

Richard

* RE: No resync
  2010-01-12  4:44 No resync Richard Scobie
@ 2010-01-12  4:52 ` Leslie Rhorer
  2010-01-12  4:58   ` Richard Scobie
  0 siblings, 1 reply; 6+ messages in thread
From: Leslie Rhorer @ 2010-01-12  4:52 UTC
  To: 'Richard Scobie', linux-raid

> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Richard Scobie
> Sent: Monday, January 11, 2010 10:45 PM
> To: linux-raid@vger.kernel.org
> Subject: No resync
> 
> Is there a way to  force assemble a stopped array that has too many
> failed devices, while disabling the resync that would normally occur?

	Well, of course if the array has too few members to start (even
after the --force), then it won't start and thus won't resync, but assuming
one has enough targets to assemble and run the array, I believe
--assume-clean will do what you want.  It's in the man page.
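
The "enough members to start" condition can be illustrated with a minimal Python sketch. It assumes the usual redundancy of each md level; the table and the function name `can_start` are illustrative and are not taken from mdadm or the kernel.

```python
# Illustrative table (an assumption, not mdadm's code): how many missing
# members each striped/parity md level can tolerate while still running.
MAX_MISSING = {"raid0": 0, "raid4": 1, "raid5": 1, "raid6": 2}


def can_start(level: str, total: int, available: int) -> bool:
    """Return True if `available` of `total` members are enough to run."""
    if not 0 <= available <= total:
        raise ValueError("available must be between 0 and total")
    return total - available <= MAX_MISSING[level]


# A 16-drive RAID6 can run with 14 members but not with 11.
print(can_start("raid6", 16, 14))  # True
print(can_start("raid6", 16, 11))  # False
```

This only models the member count; whether --force can bring a particular failed device back depends on its superblock, which the sketch does not consider.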


* Re: No resync
  2010-01-12  4:52 ` Leslie Rhorer
@ 2010-01-12  4:58   ` Richard Scobie
  2010-01-12  8:37     ` Robin Hill
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Scobie @ 2010-01-12  4:58 UTC
  To: linux-raid

Leslie Rhorer wrote:

>> Is there a way to force-assemble a stopped array that has too many
>> failed devices, while disabling the resync that would normally occur?
>
> 	Well, of course if the array has too few members to start (even
> after the --force), then it won't start and thus won't resync, but assuming
> one has enough targets to assemble and run the array, I believe
> --assume-clean will do what you want.  It's in the man page.
>

By using force as part of an assemble, all failed devices are set to healthy
again and a resync is immediately started.

According to the man page, --assume-clean is not an available option for
assemble.

I have tried --read-only but that is not an option either.

I need to reassemble without resyncing so I can fsck first.

Regards,

Richard

* Re: No resync
  2010-01-12  4:58   ` Richard Scobie
@ 2010-01-12  8:37     ` Robin Hill
  2010-01-12 18:27       ` Richard Scobie
  0 siblings, 1 reply; 6+ messages in thread
From: Robin Hill @ 2010-01-12  8:37 UTC
  To: linux-raid

On Tue Jan 12, 2010 at 05:58:10PM +1300, Richard Scobie wrote:

> Leslie Rhorer wrote:
> 
> >> Is there a way to force-assemble a stopped array that has too many
> >> failed devices, while disabling the resync that would normally occur?
> >
> > 	Well, of course if the array has too few members to start (even
> > after the --force), then it won't start and thus won't resync, but assuming
> > one has enough targets to assemble and run the array, I believe
> > --assume-clean will do what you want.  It's in the man page.
> >
> 
> By using force as part of an assemble, all failed devices are set to healthy
> again and a resync is immediately started.
> 
> According to the man page, --assume-clean is not an available option for
> assemble.
> 
> I have tried --read-only but that is not an option either.
> 
> I need to reassemble without resyncing so I can fsck first.
> 
You probably need to start it with missing members then, so it's able to
run but not to resync.

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |

* Re: No resync
  2010-01-12  8:37     ` Robin Hill
@ 2010-01-12 18:27       ` Richard Scobie
  2010-01-18  3:48         ` Neil Brown
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Scobie @ 2010-01-12 18:27 UTC
  To: linux-raid

Robin Hill wrote:

> You probably need to start it with missing members then, so it's able to
> run but not to resync.

This is not an option in assemble mode either. It looks as though the
array has to be recreated. I'm not sure why none of these options are
provided for assemble.

Anyway, in the end I did an "assemble --force" after stopping what was
left (5 drives dropped from a 16 drive RAID6), and it started but did
not initiate a resync.

Perhaps the behaviour here has changed, because I'm sure when I've done 
this in the past, it resyncs straight away.

There were some somewhat strange errors in the log:

Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdf, sector 1953182527
Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdf1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 15 devices.
Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdh, sector 1953182527
Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdh1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 14 devices.
Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdg, sector 1953182527
Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdg1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 13 devices.
Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdp, sector 1953182527
Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdp1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 12 devices.
Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdr, sector 1953182527
Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdr1, disabling device.
Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 11 devices.

The cause is a controller problem, but after the first 2 drives were 
disabled, I don't know why there were "raid5: Operation continuing 
on..." messages as another 3 drives were offlined. A RAID6 array should 
stop when a third device fails.
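
To make the point concrete, here is a small Python sketch that pulls the failed devices out of the "Disk failure" lines quoted above and lists the ones that failed after the two a RAID6 can tolerate. The regular expression and the names `FAIL_RE`, `failed_devices` and `RAID6_MAX_FAILED` are my own for illustration; they are not part of md or mdadm.

```python
import re

# Matches the md failure message as it appears in the log excerpt above.
FAIL_RE = re.compile(r"raid5: Disk failure on (\w+), disabling device")

# A RAID6 array tolerates at most two failed members.
RAID6_MAX_FAILED = 2

LOG = [
    "Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdf1, disabling device.",
    "Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdh1, disabling device.",
    "Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdg1, disabling device.",
    "Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdp1, disabling device.",
    "Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdr1, disabling device.",
]


def failed_devices(log_lines):
    """Return failed device names in the order the log reports them."""
    names = []
    for line in log_lines:
        match = FAIL_RE.search(line)
        if match:
            names.append(match.group(1))
    return names


failed = failed_devices(LOG)
# Failures beyond the second are ones the array could not have survived.
beyond_tolerance = failed[RAID6_MAX_FAILED:]
print(failed)            # ['sdf1', 'sdh1', 'sdg1', 'sdp1', 'sdr1']
print(beyond_tolerance)  # ['sdg1', 'sdp1', 'sdr1']
```

So from sdg1 onwards the "Operation continuing" messages describe an array that can no longer be serving data.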

Regards,

Richard

* Re: No resync
  2010-01-12 18:27       ` Richard Scobie
@ 2010-01-18  3:48         ` Neil Brown
  0 siblings, 0 replies; 6+ messages in thread
From: Neil Brown @ 2010-01-18  3:48 UTC
  To: Richard Scobie; +Cc: linux-raid

On Wed, 13 Jan 2010 07:27:20 +1300
Richard Scobie <richard@sauce.co.nz> wrote:

> Robin Hill wrote:
> 
> > You probably need to start it with missing members then, so it's able to
> > run but not to resync.
> 
> This is not an option in assemble mode either. It looks as though the
> array has to be recreated. I'm not sure why none of these options are
> provided for assemble.
> 
> Anyway, in the end I did an "assemble --force" after stopping what was
> left (5 drives dropped from a 16 drive RAID6), and it started but did
> not initiate a resync.

This is what I would expect.  When mdadm needs to "fix" the array to get it to
assemble, it does the minimum work necessary to make the data available.  That
means that it normally won't add any 'redundant' data, so no resync will
happen. (RAID10 is a bit of an exception for complex reasons that I don't
want to go into at the moment).

> 
> Perhaps the behaviour here has changed, because I'm sure when I've done 
> this in the past, it resyncs straight away.

I would be surprised... (but that does happen).

> 
> There were some somewhat strange errors in the log:
> 
> Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdf, sector 1953182527
> Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
> Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdf1, disabling device.
> Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 15 devices.
> Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdh, sector 1953182527
> Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
> Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdh1, disabling device.
> Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 14 devices.
> Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdg, sector 1953182527
> Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
> Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdg1, disabling device.
> Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 13 devices.
> Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdp, sector 1953182527
> Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
> Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdp1, disabling device.
> Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 12 devices.
> Jan 12 17:09:09 sam kernel: end_request: I/O error, dev sdr, sector 1953182527
> Jan 12 17:09:09 sam kernel: md: super_written gets error=-5, uptodate=0
> Jan 12 17:09:09 sam kernel: raid5: Disk failure on sdr1, disabling device.
> Jan 12 17:09:09 sam kernel: raid5: Operation continuing on 11 devices.
> 
> The cause is a controller problem, but after the first 2 drives were 
> disabled, I don't know why there were "raid5: Operation continuing 
> on..." messages as another 3 drives were offlined. A RAID6 array should 
> stop when a third device fails.

Yes it should .... I really should tidy that code up.

NeilBrown


> 
> Regards,
> 
> Richard

