* Re: s2disk and raid
2007-04-04 5:20 ` Neil Brown
@ 2007-04-04 18:53 ` Tim Dijkstra
2007-04-04 20:47 ` Michael Tokarev
2007-04-06 9:08 ` Luca Berra
2 siblings, 0 replies; 9+ messages in thread
From: Tim Dijkstra @ 2007-04-04 18:53 UTC (permalink / raw)
To: Neil Brown, suspend-devel List, linux-raid, 415441
[-- Attachment #1.1: Type: text/plain, Size: 3371 bytes --]
On Wed, 4 Apr 2007 15:20:56 +1000
Neil Brown <neilb@suse.de> wrote:
> On Tuesday April 3, newsuser@famdijkstra.org wrote:
> > Hi,
> >
> > I've got a bugreport [0] from a user trying to use raid and uswsusp. He's
> > using initramfs-tools available in debian. I'll describe the problem
> > and my analysis, maybe you can comment on what you think. A warning: I only
> > have a casual understanding of raid, never looked at any code related to it.
> >
> > This is a setup where root may be on raid, but swap isn't. Swap on raid
> > will be very difficult to support, I think.
>
> Nah... shouldn't be a problem.... well, maybe raid5.
OK, that is nice to hear.
> >
> > When s2disk is started, nothing special is done to the array. It may be
> > in an unclean state (just like filesystems). Image is written to disk.
> >
> > After the power cycle the kernel boots, devices are discovered, among
> > which the ones holding raid. Then we try to find the device that holds
> > swap in case of resume and / in case of a normal boot.
> >
> > Now comes a crucial point. The script that finds the raid array, finds
> > the array in an unclean state and starts syncing.
>
> Uhm, so you are finding the device for the root filesystem before you
> have decided which case it will be (resume or normal boot). Can that
> be delayed until after the decision? It's probably not important but
> it seems neater.
> Or do you need the root device even when resuming (I guess if swap is
> in a file on the root filesystem....)
It is not that we need the root filesystem for resume. It is more about how
the initramfs is currently set up. To be as general as possible, all
partitions are discovered, one of which will contain the image.
> The trick is to use the 'start_ro' module parameter.
> echo 1 > /sys/module/md_mod/parameters/start_ro
>
> Then md will start arrays assuming read-only. No resync will be
> started, no superblock will be written. They stay this way until the
> first write at which point they become normal read-write and any
> required resync starts.
>
> So you can start arrays 'readonly', and resume off a raid1 without any
> risk of the resync starting when it shouldn't.
>
> It is probably best to 'echo 0 > ....' once you have committed to a
> normal boot, but it isn't really critical.
This is very good to know. I think we can work out something with the
debian-maintainer based on this.
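
A minimal sketch of how an initramfs script might apply the suggestion above: set start_ro before any array is assembled, and clear it once a normal boot has been chosen. The sysfs path is the one quoted above; the function name set_md_start_ro and the MD_START_RO override variable are made up for illustration, and this is not the code any Debian package actually ships.

```shell
#!/bin/sh
# Path taken from the mail above. It can be overridden so the function can be
# tried against a scratch file instead of the real sysfs entry.
MD_START_RO="${MD_START_RO:-/sys/module/md_mod/parameters/start_ro}"

# set_md_start_ro VALUE
# Write 0 or 1 to the md start_ro module parameter.
# Returns 2 on a bad argument, 1 if the parameter file is not writable.
set_md_start_ro() {
    case "$1" in
        0|1) ;;
        *)
            echo "set_md_start_ro: expected 0 or 1, got '$1'" >&2
            return 2
            ;;
    esac
    if [ ! -w "$MD_START_RO" ]; then
        echo "set_md_start_ro: $MD_START_RO is not writable (md_mod not loaded?)" >&2
        return 1
    fi
    echo "$1" > "$MD_START_RO"
}
```

The idea would be to call "set_md_start_ro 1" before the scripts that assemble arrays run, and "set_md_start_ro 0" only after the resume attempt has fallen through to a normal boot.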
> >
> > The debian-maintainer of mdadm thinks that the suspend process should
> > have left the array in a clean state, but this is IMHO impossible.
>
> It probably would be best if suspend left the process in a clean
> state. It shouldn't be too hard, but it needs to be done in the
> kernel.
> However it isn't critical to all of this working well.
>
> I mentioned above that if swap is on raid5 it might be awkward. This
> is because raid5 caches some data that is on disk. If you snapshot
> the raid5 memory, then resume raid5 so it can write to disk, when you
> come back from suspend you could have old data in the cache. It
> should be possible to fix this, but it is currently a potential
> problem that might be worth warning people against.
OK, if we can support suspend and raid and even with swap on raid0 or
raid1, I'm happy.
Thanks for the input.
grts Tim
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
[-- Attachment #3: Type: text/plain, Size: 170 bytes --]
_______________________________________________
Suspend-devel mailing list
Suspend-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/suspend-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: s2disk and raid
2007-04-04 5:20 ` Neil Brown
2007-04-04 18:53 ` Tim Dijkstra
@ 2007-04-04 20:47 ` Michael Tokarev
2007-04-12 5:37 ` Luis Rodrigo Gallardo Cruz
2007-04-06 9:08 ` Luca Berra
2 siblings, 1 reply; 9+ messages in thread
From: Michael Tokarev @ 2007-04-04 20:47 UTC (permalink / raw)
To: Neil Brown; +Cc: Tim Dijkstra, suspend-devel List, linux-raid, 415441
Neil Brown wrote:
> On Tuesday April 3, newsuser@famdijkstra.org wrote:
[]
>> After the power cycle the kernel boots, devices are discovered, among
>> which the ones holding raid. Then we try to find the device that holds
>> swap in case of resume and / in case of a normal boot.
>>
>> Now comes a crucial point. The script that finds the raid array, finds
>> the array in an unclean state and starts syncing.
[]
> So you can start arrays 'readonly', and resume off a raid1 without any
> risk of the resync starting when it shouldn't.
But I wonder why this raid is necessary in the first place.
For raid1, assuming the superblock is at the end, -- the only
thing needed for resume is one component of the mirror. I.e.,
if your raid array is (was) composed of hda1 and hdb1, either
of the two will do as source of resume image. The trick is to
find which, in case the array was degraded -- and mdadm does the
job here, but assembling it isn't really necessary. Maybe mdadm
can be told to "examine" the component devices and write a short
line to stdout *instead* of real assembly (like mdadm -A --dummy),
to show the most recent component, and the offset if superblock
is at the beginning... having that, it will be possible to resume
from that component directly...
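
A rough sketch of the "examine instead of assemble" idea, using the existing "mdadm --examine" rather than the hypothetical "--dummy" switch. It picks the component whose superblock reports the highest "Events" count. The function names are made up for illustration, and the sketch assumes the counter is printed as a plain integer; components whose output it cannot parse are skipped. It does not deal with the superblock offset question raised above.

```shell
#!/bin/sh
# events_from_examine
# Read "mdadm --examine" output on stdin and print the value of its
# "Events :" line with surrounding whitespace removed.
events_from_examine() {
    awk -F: '/^[ \t]*Events[ \t]*:/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# newest_component DEV...
# Print the DEV whose md superblock has the highest event count.
# Returns non-zero if no component yielded a usable count.
newest_component() {
    best_dev=
    best_events=-1
    for dev in "$@"; do
        events=$(mdadm --examine "$dev" 2>/dev/null | events_from_examine)
        case "$events" in
            ''|*[!0-9]*) continue ;;  # no superblock, or a format not handled here
        esac
        if [ "$events" -gt "$best_events" ]; then
            best_events=$events
            best_dev=$dev
        fi
    done
    [ -n "$best_dev" ] && printf '%s\n' "$best_dev"
}
```

For the example in this mail that would be something like "newest_component /dev/hda1 /dev/hdb1".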
By the way, my home-grown initramfs stuff accepts several devices
for the resume= command line option, and tries each in turn. If the main
disks have more-or-less stable names, this may be an alternative way.
I mean, just give the component devices in the resume= line...
Yes, this way it may do some weird things in the case where the original
swap array was degraded (with first component, which contained a
valid resume image, removed from the array)... But it's not really
a big issue, since - usually anyway - if one uses resume=, it means
the machine in question isn't some remote 100-miles-away, but it's
here, and it's ok to bypass the resume for recovery purposes.
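
The "try each device in turn" approach could look roughly like the sketch below. It is not the home-grown initramfs code mentioned above, which is not shown in this thread. The function names are invented, and the probe is deliberately left as a parameter: a real one would have to check for a valid suspend image, while the placeholder here only checks that the device is readable.

```shell
#!/bin/sh
# first_resume_device PROBE DEV...
# Run "PROBE DEV" for each candidate in order and print the first DEV for
# which it succeeds. Returns 1 if no candidate passes.
first_resume_device() {
    probe_cmd=$1
    shift
    for dev in "$@"; do
        if "$probe_cmd" "$dev"; then
            printf '%s\n' "$dev"
            return 0
        fi
    done
    return 1
}

# Placeholder probe (hypothetical): only checks that the device is readable.
# It does NOT verify that a resume image is present.
is_readable_dev() {
    [ -r "$1" ]
}
```

With the devices from the example above: "first_resume_device is_readable_dev /dev/hda1 /dev/hdb1".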
Just some random thoughts.
/mjt
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: s2disk and raid
2007-04-04 20:47 ` Michael Tokarev
@ 2007-04-12 5:37 ` Luis Rodrigo Gallardo Cruz
2007-04-17 18:58 ` Bug#415441: " Tim Dijkstra
0 siblings, 1 reply; 9+ messages in thread
From: Luis Rodrigo Gallardo Cruz @ 2007-04-12 5:37 UTC (permalink / raw)
To: Michael Tokarev; +Cc: Neil Brown, suspend-devel List, linux-raid, 415441
[-- Attachment #1.1: Type: text/plain, Size: 1794 bytes --]
[I'm the original bug reporter. Sorry for getting so late into the
conversation]
On Thu, Apr 05, 2007 at 12:47:49AM +0400, Michael Tokarev wrote:
> Neil Brown wrote:
> > On Tuesday April 3, newsuser@famdijkstra.org wrote:
> []
> >> After the power cycle the kernel boots, devices are discovered, among
> >> which the ones holding raid. Then we try to find the device that holds
> >> swap in case of resume and / in case of a normal boot.
> >>
> >> Now comes a crucial point. The script that finds the raid array, finds
> >> the array in an unclean state and starts syncing.
> []
> > So you can start arrays 'readonly', and resume off a raid1 without any
> > risk of the resync starting when it shouldn't.
>
> But I wonder why this raid is necessary in the first place.
In the case of my original report, the array is not actually necessary,
since the resume image is in another (normal) partition. The array
gets resumed since the mdadm scripts run before the resume ones in the
initrd and they by default start *every* array in the system.
But at least the mdadm maintainer seems to think that having the
resume image in a raid device, or in an LVM logical volume inside a
raid device, or other such esoteric arrangements, is a use case worth
supporting.
Something that I seem to not have said. It's not *all* arrays that are
unclean on reboot, just one (that is used as physical volume for
LVM. I don't know if that's relevant). Also worth mentioning is that
kernel space suspend on 2.6.17 did not have this problem (or didn't
show it in my system, anyways).
After reading through the responses, I have come to think this is a
kernel issue, and have posted a report (#418823) to debian's linux-2.6
package. I'll wait to see what they have to say.
[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Bug#415441: s2disk and raid
2007-04-12 5:37 ` Luis Rodrigo Gallardo Cruz
@ 2007-04-17 18:58 ` Tim Dijkstra
0 siblings, 0 replies; 9+ messages in thread
From: Tim Dijkstra @ 2007-04-17 18:58 UTC (permalink / raw)
To: Luis Rodrigo Gallardo Cruz, 415441
Cc: Neil Brown, suspend-devel List, Michael Tokarev, linux-raid
[-- Attachment #1.1: Type: text/plain, Size: 1540 bytes --]
On Thu, 12 Apr 2007 00:37:53 -0500
Luis Rodrigo Gallardo Cruz <rodrigo@nul-unu.com> wrote:
> On Thu, Apr 05, 2007 at 12:47:49AM +0400, Michael Tokarev wrote:
> > Neil Brown wrote:
> > > On Tuesday April 3, newsuser@famdijkstra.org wrote:
> > []
> > >> After the power cycle the kernel boots, devices are discovered, among
> > >> which the ones holding raid. Then we try to find the device that holds
> > >> swap in case of resume and / in case of a normal boot.
> > >>
> > >> Now comes a crucial point. The script that finds the raid array, finds
> > >> the array in an unclean state and starts syncing.
> > []
> > > So you can start arrays 'readonly', and resume off a raid1 without any
> > > risk of the resync starting when it shouldn't.
> >
> Something that I seem to not have said. It's not *all* arrays that are
> unclean on reboot, just one (that is used as physical volume for
> LVM. I don't know if that's relevant). Also worth mentioning is that
> kernel space suspend on 2.6.17 did not have this problem (or didn't
> show it in my system, anyways).
>
> After reading through the responses, I have come to think this is a
> kernel issue, and have posted a report (#418823) to debian's linux-2.6
> package. I'll wait to see what they have to say.
Maybe there is a kernel issue, but we are still doing something wrong:
we shouldn't try to write to raid before we resume, that is just asking
for problems.
I'll look into the `readonly' option. That would fix our problem IMHO.
grts Tim
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: s2disk and raid
2007-04-04 5:20 ` Neil Brown
2007-04-04 18:53 ` Tim Dijkstra
2007-04-04 20:47 ` Michael Tokarev
@ 2007-04-06 9:08 ` Luca Berra
2 siblings, 0 replies; 9+ messages in thread
From: Luca Berra @ 2007-04-06 9:08 UTC (permalink / raw)
To: linux-raid
On Wed, Apr 04, 2007 at 03:20:56PM +1000, Neil Brown wrote:
>The trick is to use the 'start_ro' module parameter.
> echo 1 > /sys/module/md_mod/parameters/start_ro
>
>Then md will start arrays assuming read-only. No resync will be
>started, no superblock will be written. They stay this way until the
>first write at which point they become normal read-write and any
>required resync starts.
>
Uh, I thought a read-only array was supposed to remain read-only, and
that write attempts would fail.
My bad for not testing my assumptions.
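
One way to test the assumption would be to look at how the kernel reports the array in /proc/mdstat before and after the first write. The sketch below assumes that mdstat marks such arrays with "(auto-read-only)" and strictly read-only ones with "(read-only)" on the array's status line. The function name md_ro_state and the MDSTAT override variable are made up for illustration.

```shell
#!/bin/sh
# File to read; can be overridden to point at a saved copy of /proc/mdstat.
MDSTAT="${MDSTAT:-/proc/mdstat}"

# md_ro_state NAME
# Print "auto-read-only", "read-only", "inactive" or "read-write" for the
# array NAME (e.g. md0). Returns 1 if NAME is not listed.
md_ro_state() {
    awk -v name="$1" '
        $1 == name && $2 == ":" {
            found = 1
            if ($0 ~ /\(auto-read-only\)/)  print "auto-read-only"
            else if ($0 ~ /\(read-only\)/)  print "read-only"
            else if ($3 == "inactive")      print "inactive"
            else                            print "read-write"
            exit
        }
        END { if (!found) exit 1 }
    ' "$MDSTAT"
}
```

Running "md_ro_state md0" right after assembly with start_ro set, and again after writing to the array, should show whether the state changes on the first write as described.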
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
^ permalink raw reply [flat|nested] 9+ messages in thread