* MD software RAID1 vs suspend-to-disk
@ 2009-03-01 8:52 Daniel Pittman
From: Daniel Pittman @ 2009-03-01 8:52 UTC (permalink / raw)
To: linux-raid
G'day.
I have a random desktop machine here, running Debian/sid with a 2.6.26
Debian kernel. It has a two disk software RAID1, and apparently passes
through a suspend/resume cycle correctly, but...
The but is that while booting for resume it warned (in the initrd) that
the RAID array was unclean, and that it would be rebuilt.
After resuming, however, the array was listed as clean, but (damn) it
wasn't; checking the array[1] reported that there were 48800 errors, and
a repair claimed to fix them.
That makes me suspect that something went wrong with shutting down the
array during the suspend process — given it is the array with / mounted
it could still be busy, and possibly unclean.
Then, resuming detects that, starts to correct it, switches back to the
previous kernel and ... voilà, the saved "clean" state is restored,
unaware that the array was out of sync or that anything changed under
it.
I don't know quite enough about the suspend/resume implementation to
know if this is a problem, or just likely to be, or some quirk of this
system.
It does concern me, though, so: should I expect suspend on MD RAID1 to
work, cleanly, in all cases?
Regards,
Daniel
Footnotes:
[1] echo check > /sys/...
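For reference, footnote [1] alludes to the md sysfs scrub interface. A
minimal sketch, with the sysfs directory passed as a parameter; md0 in
the comments is a placeholder for whatever your array is called, and
the helper name is made up:

```shell
# Sketch of the scrub footnote [1] refers to. "$1" is the array's
# sysfs md directory, normally /sys/block/md0/md (md0 is a
# placeholder; substitute your array).
md_scrub() {
    echo check > "$1/sync_action"   # start a read-only consistency check
    # In real use, wait until "$1/sync_action" reads "idle" again
    # (or use 'mdadm --wait /dev/md0') before trusting the count.
    cat "$1/mismatch_cnt"           # sectors found out of sync so far
}
```

Writing 'repair' instead of 'check' rewrites the mismatched sectors
rather than merely counting them.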
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: MD software RAID1 vs suspend-to-disk
From: John Robinson @ 2009-03-02 0:42 UTC (permalink / raw)
To: Daniel Pittman; +Cc: linux-raid
On 01/03/2009 08:52, Daniel Pittman wrote:
> I have a random desktop machine here, running Debian/sid with a 2.6.26
> Debian kernel. It has a two disk software RAID1, and apparently passes
> through a suspend/resume cycle correctly, but...
I'm not sure if this is the same suspend/resume - there are after all
several Googleable reasons why one might suspend or resume various
things - but it might be worth a look at NeilB's recent post of a patch
to "hopefully enable suspend/resume of md devices":
http://marc.info/?l=linux-raid&m=123440845819870&w=2
Cheers,
John.
* Re: MD software RAID1 vs suspend-to-disk
From: Daniel Pittman @ 2009-03-02 2:23 UTC (permalink / raw)
To: linux-raid
John Robinson <john.robinson@anonymous.org.uk> writes:
> On 01/03/2009 08:52, Daniel Pittman wrote:
>
>> I have a random desktop machine here, running Debian/sid with a 2.6.26
>> Debian kernel. It has a two disk software RAID1, and apparently passes
>> through a suspend/resume cycle correctly, but...
>
> I'm not sure if this is the same suspend/resume - there are after all several
> Googleable reasons why one might suspend or resume various things - but it
> might be worth a look at NeilB's recent post of a patch to "hopefully enable
> suspend/resume of md devices":
> http://marc.info/?l=linux-raid&m=123440845819870&w=2
No, that appears to be about suspending and resuming access to the
MD device while reconfiguring it; I don't /think/ that is accessed
during a system-wide suspend/resume (aka hibernate, or s2disk) cycle.
Certainly, it doesn't look like the path is invoked for that from my
reading of the code.
Regards,
Daniel
* Re: MD software RAID1 vs suspend-to-disk
From: NeilBrown @ 2009-03-02 2:58 UTC (permalink / raw)
To: Daniel Pittman; +Cc: linux-raid
On Mon, March 2, 2009 1:23 pm, Daniel Pittman wrote:
> John Robinson <john.robinson@anonymous.org.uk> writes:
>> On 01/03/2009 08:52, Daniel Pittman wrote:
>>
>>> I have a random desktop machine here, running Debian/sid with a 2.6.26
>>> Debian kernel. It has a two disk software RAID1, and apparently passes
>>> through a suspend/resume cycle correctly, but...
>>
>> I'm not sure if this is the same suspend/resume - there are after all
>> several
>> Googleable reasons why one might suspend or resume various things - but
>> it
>> might be worth a look at NeilB's recent post of a patch to "hopefully
>> enable
>> suspend/resume of md devices":
>> http://marc.info/?l=linux-raid&m=123440845819870&w=2
>
> No, that appears to be about suspending and resuming access to the
> MD device while reconfiguring it; I don't /think/ that is accessed
> during a system-wide suspend/resume (aka hibernate, or s2disk) cycle.
>
> Certainly, it doesn't look like the path is invoked for that from my
> reading of the code.
Correct, they are completely unrelated.
I have never tried hibernating to an md array, but I think others have,
though I don't have a lot of specifics.
One observation is that you really don't want resync to start before the
resume has completed.
For this reason we have the 'start_ro' parameter.
Setting that to 1, e.g.
echo 1 > /sys/module/md_mod/parameters/start_ro
will mean that resync will not start until the first write to the array.
The initrd should set this before assembling an md array to load
a resume image from.
If it doesn't, it shouldn't cause major problems, but I'm not making
any promises.
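The ordering described here can be sketched as follows; the parameter
path is the real md_mod sysfs location, while the helper name and the
mdadm step are illustrative:

```shell
# Sketch of the initrd-side ordering: start_ro must be set *before*
# the array is assembled, so the resume image can be read without
# triggering a resync. "$1" is the start_ro parameter file, normally
# /sys/module/md_mod/parameters/start_ro.
md_set_start_ro() {
    echo 1 > "$1"    # arrays now assemble in auto-read-only mode
}
# After this, 'mdadm --assemble --scan' brings the array up read-only;
# the first write (once resume has completed) flips it read-write, and
# only then may a pending resync begin.
```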
It is likely that your observed symptom of "check reports 48800
mismatches" has nothing to do with hibernate/resume.
Presumably you have swap on md/raid1 (as that is where hibernate writes).
The nature of swap writeout is that it is entirely possible for different
data to be written to each device of a raid1 when a page is swapped out.
However in that case, the data will never be read back in so the
apparent corruption is not a problem.
I would recommend that you run 'repair' before hibernating, to be sure that
the array is in sync. Then hibernate/resume and see if it is still in
sync. I suspect it will be.
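That experiment might look like the following sketch; the helper names
are made up, and /sys/block/md0/md stands in for the actual array's
sysfs directory:

```shell
# Sketch of the suggested experiment. "$1" is the array's sysfs md
# directory (e.g. /sys/block/md0/md). Run the repair, wait for it to
# finish, hibernate and resume, then re-check and read the count.
md_repair() {
    echo repair > "$1/sync_action"   # rewrite any out-of-sync sectors
}
md_recheck() {
    echo check > "$1/sync_action"    # scrub again after resume
    cat "$1/mismatch_cnt"            # 0 here means resume kept it in sync
}
```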
NeilBrown
* Re: MD software RAID1 vs suspend-to-disk
From: Daniel Pittman @ 2009-03-02 3:40 UTC (permalink / raw)
To: linux-raid
"NeilBrown" <neilb@suse.de> writes:
> On Mon, March 2, 2009 1:23 pm, Daniel Pittman wrote:
>> John Robinson <john.robinson@anonymous.org.uk> writes:
>>> On 01/03/2009 08:52, Daniel Pittman wrote:
>>>
>>>> I have a random desktop machine here, running Debian/sid with a
>>>> 2.6.26 Debian kernel. It has a two disk software RAID1, and
>>>> apparently passes through a suspend/resume cycle correctly, but...
[...]
>> No, that appears to be about suspending and resuming access to the
>> MD device while reconfiguring it; I don't /think/ that is accessed
>> during a system-wide suspend/resume (aka hibernate, or s2disk) cycle.
>>
>> Certainly, it doesn't look like the path is invoked for that from my
>> reading of the code.
>
> Correct, they are completely unrelated.
>
> I have never tried hibernating to an md array, but I think others
> have, though I don't have a lot of specifics.
>
> One observation is that you really don't want resync to start before
> the resume has completed. For this reason we have the 'start_ro'
> parameter. Setting that to 1, e.g.
>
> echo 1 > /sys/module/md_mod/parameters/start_ro
>
> will mean that resync will not start until the first write to the
> array. The initrd should set this before assembling an md array to
> load a resume image from.
Ah. Debian already does this; see:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=415441
(Actually, since you wrote in that bug thread you already know. :)
Hmmm. I have swap on LVM on MD, though, and I suspect that LVM writes
to disk when it discovers and activates the volume groups...
Let me try and find out. Then I can go and be grumpy, but at least
complain to the right people about this. :)
[...]
> It is likely that your observed symptom of "check reports 48800
> mismatches" has nothing to do with hibernate/resume.
OK.
> Presumably you have swap on md/raid1 (as that is where hibernate
> writes). The nature of swap writeout is that it is entirely possible
> for different data to be written to each device of a raid1 when a page
> is swapped out.
>
> However in that case, the data will never be read back in so the
> apparent corruption is not a problem.
Well, that is a relief, at least.
> I would recommend that you run 'repair' before hibernating, to be sure
> that the array is in sync. Then hibernate/resume and see if it is
> still in sync. I suspect it will be.
That seems reasonable; I will test it.
Regards,
Daniel