linux-raid.vger.kernel.org archive mirror
* Questions regarding startup of imsm container
@ 2010-03-23  3:56 Randy Terbush
  2010-03-23  8:04 ` [PATCH] (Re: Questions regarding startup of imsm container) Luca Berra
  2010-03-23 21:01 ` Questions regarding startup of imsm container Dan Williams
  0 siblings, 2 replies; 20+ messages in thread
From: Randy Terbush @ 2010-03-23  3:56 UTC (permalink / raw)
  To: linux-raid

Having a go at building a raid5 array using the new imsm support and
having good luck keeping drives in the array, etc. Nice work. I have a
few questions though as I am having some trouble figuring out how to
properly start this container.

# mdadm --version
mdadm - v3.1.2 - 10th March 2010

# mdadm -Es
ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c

# ls -l /dev/md/
total 0
lrwxrwxrwx 1 root root 6 Mar 22 20:54 0 -> ../md0
lrwxrwxrwx 1 root root 8 Mar 22 20:54 127 -> ../md127
lrwxrwxrwx 1 root root 8 Mar 22 20:54 Volume0_0 -> ../md127

As you can see, the name for the link in /dev/md does not agree with
the name that the Examine is coming up with.

Is it better to just forgo the ARRAY statements and go with an AUTO +imsm?

And last, does the concept of a write-intent bitmap make sense on an
imsm container? If so, I get a segv if trying to run mdadm /dev/mdX
-Gb internal on either device.

Thanks for your help

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] (Re: Questions regarding startup of imsm container)
  2010-03-23  3:56 Questions regarding startup of imsm container Randy Terbush
@ 2010-03-23  8:04 ` Luca Berra
  2010-03-23 12:58   ` Randy Terbush
                     ` (2 more replies)
  2010-03-23 21:01 ` Questions regarding startup of imsm container Dan Williams
  1 sibling, 3 replies; 20+ messages in thread
From: Luca Berra @ 2010-03-23  8:04 UTC (permalink / raw)
  To: linux-raid; +Cc: neilb

[-- Attachment #1: Type: text/plain, Size: 1963 bytes --]

On Mon, Mar 22, 2010 at 09:56:01PM -0600, Randy Terbush wrote:
>Having a go at building a raid5 array using the new imsm support and
>having good luck keeping drives in the array, etc. Nice work. I have a
>few questions though as I am having some trouble figuring out how to
>properly start this container.
>
># mdadm --version
>mdadm - v3.1.2 - 10th March 2010
>
># mdadm -Es
>ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
>ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
>member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>
># ls -l /dev/md/
>total 0
>lrwxrwxrwx 1 root root 6 Mar 22 20:54 0 -> ../md0
>lrwxrwxrwx 1 root root 8 Mar 22 20:54 127 -> ../md127
>lrwxrwxrwx 1 root root 8 Mar 22 20:54 Volume0_0 -> ../md127
>
>As you can see, the name for the link in /dev/md does not agree with
>the name that the Examine is coming up with.
please read the mdadm.conf manpage, under the section "HOMEHOST"
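
For reference, a minimal sketch of the relevant fragment (keywords are
from that manpage; the values are illustrative, not taken from your
system):

# /etc/mdadm.conf (hypothetical fragment)
# use this machine's hostname when deciding whether an array is local:
HOMEHOST <system>
# or tell mdadm to ignore homehost information entirely:
#HOMEHOST <ignore>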

>Is it better to just forgo the ARRAY statements and go with an AUTO +imsm?
>
>And last, does the concept of a write-intent bitmap make sense on an
>imsm container? If so, I get a segv if trying to run mdadm /dev/mdX
>-Gb internal on either device.

I don't believe it makes sense at all; imsm certainly does not support
an internal bitmap (there is no provisioning for it in the metadata).

The attached patch completely disables bitmap support for arrays with
externally managed metadata.

On a style note, I do not like having the struct superswitch, which is
a collection of function pointers that is then instantiated with only
some of the pointers initialized; it forces checking at runtime whether
they are set or not.
A possible solution would be to wrap every call of these in a macro
that checks for NULL first, but how do you return the correct return
type from that?

L.

-- 
Luca Berra -- bluca@comedia.it
          Communication Media & Services S.r.l.
   /"\
   \ /     ASCII RIBBON CAMPAIGN
    X        AGAINST HTML MAIL
   / \

[-- Attachment #2: 0001-External-metadata-array-do-not-support-bitmaps.patch --]
[-- Type: text/plain, Size: 1913 bytes --]

From f970449a469f009d3f31703151652361acb8a41e Mon Sep 17 00:00:00 2001
From: Luca Berra <bluca@comedia.it>
Date: Tue, 23 Mar 2010 08:49:00 +0100
Subject: [PATCH] External metadata array do not support bitmaps

Signed-off-by: Luca Berra <bluca@comedia.it>
---
 Create.c |    6 ++++++
 Grow.c   |    6 ++++++
 bitmap.c |    5 +++++
 3 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/Create.c b/Create.c
index 909ac5d..ae905fc 100644
--- a/Create.c
+++ b/Create.c
@@ -206,6 +206,12 @@ int Create(struct supertype *st, char *mddev,
 		fprintf(stderr, Name ": You haven't given enough devices (real or missing) to create this array\n");
 		return 1;
 	}
+	if (st && st->ss->external && bitmap_file) {
+		fprintf(stderr,
+			Name ": This metadata type does not support "
+			"bitmaps\n");
+		return 1;
+	}
 	if (bitmap_file && level <= 0) {
 		fprintf(stderr, Name ": bitmaps not meaningful with level %s\n",
 			map_num(pers, level)?:"given");
diff --git a/Grow.c b/Grow.c
index 6264996..d736d27 100644
--- a/Grow.c
+++ b/Grow.c
@@ -283,6 +283,12 @@ int Grow_addbitmap(char *devname, int fd, char *file, int chunk, int delay, int
 			array.major_version, array.minor_version);
 		return 1;
 	}
+	if (st->ss->external) {
+		fprintf(stderr,
+			Name ": This metadata type does not support "
+			"bitmaps\n");
+		return 1;
+	}
 	if (strcmp(file, "none") == 0) {
 		fprintf(stderr, Name ": no bitmap found on %s\n", devname);
 		return 1;
diff --git a/bitmap.c b/bitmap.c
index 088e37d..ff63588 100644
--- a/bitmap.c
+++ b/bitmap.c
@@ -227,6 +227,11 @@ bitmap_info_t *bitmap_file_read(char *filename, int brief, struct supertype **st
 		if (!st) {
 			/* just look at device... */
 			lseek(fd, 0, 0);
+		} else if (st->ss->external) {
+			fprintf(stderr,
+				Name ": This metadata type does not support "
+				"bitmaps\n");
+			return NULL;
 		} else {
 			st->ss->locate_bitmap(st, fd);
 		}
-- 
1.7.0.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] (Re: Questions regarding startup of imsm container)
  2010-03-23  8:04 ` [PATCH] (Re: Questions regarding startup of imsm container) Luca Berra
@ 2010-03-23 12:58   ` Randy Terbush
  2010-03-23 14:22     ` Luca Berra
  2010-03-23 14:33     ` Randy Terbush
  2010-03-23 23:06   ` [PATCH] " Dan Williams
  2010-03-24  0:57   ` Neil Brown
  2 siblings, 2 replies; 20+ messages in thread
From: Randy Terbush @ 2010-03-23 12:58 UTC (permalink / raw)
  To: linux-raid, neilb

On Tue, Mar 23, 2010 at 2:04 AM, Luca Berra <bluca@comedia.it> wrote:
>> # mdadm --version
>> mdadm - v3.1.2 - 10th March 2010
>>
>> # mdadm -Es
>> ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
>> ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
>> member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>>
>> # ls -l /dev/md/
>> total 0
>> lrwxrwxrwx 1 root root 6 Mar 22 20:54 0 -> ../md0
>> lrwxrwxrwx 1 root root 8 Mar 22 20:54 127 -> ../md127
>> lrwxrwxrwx 1 root root 8 Mar 22 20:54 Volume0_0 -> ../md127
>>
>> As you can see, the name for the link in /dev/md does not agree with
>> the name that the Examine is coming up with.
>
> please read mdadm.conf manpage, under the section "HOMEHOST"

If I understand this correctly, I think there still may be a problem
as I am not clear on how I could have set the homehost in the metadata
for this imsm array. The Volume0 is provided by imsm and is configured
in the option ROM.

The underlying question here is should the ARRAY entry in mdadm.conf
be changed to reflect the on disk name of the device, or is the
startup process munging that entry when it processes mdadm.conf to
strip the _0.

I'll try setting HOMEHOST <ignore> to see if I am getting expected results.

I seem to have some problems with startup still as I have the
following entry where the container is now md127. Was md0 when
originally created.

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md126 : active raid5 sdb[3] sdc[2] sdd[1] sde[0]
      2930280448 blocks super external:/md127/0 level 5, 64k chunk,
algorithm 0 [4/4] [UUUU]

md127 : inactive sde[3](S) sdb[2](S) sdc[1](S) sdd[0](S)
      9028 blocks super external:imsm

unused devices: <none>

I am also running into a problem where fsck will crash during boot on
the ext4 filesystems that this array contains. No problem running fsck
after the boot process has completed so have not seemed to find the
magic with order of startup for this device.


>
>> Is it better to just forgo the ARRAY statements and go with an AUTO +imsm?
>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] (Re: Questions regarding startup of imsm container)
  2010-03-23 12:58   ` Randy Terbush
@ 2010-03-23 14:22     ` Luca Berra
  2010-03-23 14:33     ` Randy Terbush
  1 sibling, 0 replies; 20+ messages in thread
From: Luca Berra @ 2010-03-23 14:22 UTC (permalink / raw)
  To: linux-raid

On Tue, Mar 23, 2010 at 06:58:33AM -0600, Randy Terbush wrote:
>On Tue, Mar 23, 2010 at 2:04 AM, Luca Berra <bluca@comedia.it> wrote:
>>> # mdadm --version
>>> mdadm - v3.1.2 - 10th March 2010
>>>
>>> # mdadm -Es
>>> ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
>>> ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
>>> member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>>>
>>> # ls -l /dev/md/
>>> total 0
>>> lrwxrwxrwx 1 root root 6 Mar 22 20:54 0 -> ../md0
>>> lrwxrwxrwx 1 root root 8 Mar 22 20:54 127 -> ../md127
>>> lrwxrwxrwx 1 root root 8 Mar 22 20:54 Volume0_0 -> ../md127
>>>
>>> As you can see, the name for the link in /dev/md does not agree with
>>> the name that the Examine is coming up with.
>>
>> please read mdadm.conf manpage, under the section "HOMEHOST"
>
>If I understand this correctly, I think there still may be a problem
>as I am not clear on how I could have set the homehost in the metadata
>for this imsm array. The Volume0 is provided by imsm and is configured
>in the option ROM.
>
>The underlying question here is should the ARRAY entry in mdadm.conf
>be changed to reflect the on disk name of the device, or is the
>startup process munging that entry when it processes mdadm.conf to
>strip the _0.

As far as I understand there is no way to set the homehost in imsm
metadata, so auto-assembly will consider the array a foreign one and
append _x to it.

>I'll try setting HOMEHOST <ignore> to see if I am getting expected results.
>
>I seem to have some problems with startup still as I have the
>following entry where the container is now md127. Was md0 when
>originally created.

I believe this is also normal, since there is no place in the imsm
metadata to store a persistent minor number.

># cat /proc/mdstat
>Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
>md126 : active raid5 sdb[3] sdc[2] sdd[1] sde[0]
>      2930280448 blocks super external:/md127/0 level 5, 64k chunk,
>algorithm 0 [4/4] [UUUU]
>
>md127 : inactive sde[3](S) sdb[2](S) sdc[1](S) sdd[0](S)
>      9028 blocks super external:imsm
>
>unused devices: <none>

Personally I never cared, since I mount all filesystems by UUID, so
auto-assembly works for me.
You could try defining the array in mdadm.conf; in that case it should
not be considered foreign any more and it will be named Volume0.
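
To make that concrete, a rough sketch of both options (the fstab line,
filesystem UUID and mount point are made up; the ARRAY line simply
reuses the UUIDs from the --examine output quoted above):

# /etc/fstab: mount by filesystem UUID, so the md device name does not matter
UUID=01234567-89ab-cdef-0123-456789abcdef  /data  ext4  defaults  0 2

# /etc/mdadm.conf: name the member array explicitly so it is not treated as foreign
ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0 member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c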

>I am also running into a problem where fsck will crash during boot on
>the ext4 filesystems that this array contains. No problem running fsck
>after the boot process has completed so have not seemed to find the
>magic with order of startup for this device.
no idea about that, sorry

L.


-- 
Luca Berra -- bluca@comedia.it
         Communication Media & Services S.r.l.
  /"\
  \ /     ASCII RIBBON CAMPAIGN
   X        AGAINST HTML MAIL
  / \

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: (Re: Questions regarding startup of imsm container)
  2010-03-23 12:58   ` Randy Terbush
  2010-03-23 14:22     ` Luca Berra
@ 2010-03-23 14:33     ` Randy Terbush
  2010-03-23 14:49       ` Randy Terbush
                         ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Randy Terbush @ 2010-03-23 14:33 UTC (permalink / raw)
  To: linux-raid

To follow-up this startup challenge... here is what I am getting.

mdraid is being started with mdadm -As

I have the following in mdadm.conf

HOMEHOST Volume0
#DEVICE /dev/sd[bcde]
AUTO +imsm hifi:0 -all
ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c

The following devices are being created.

# ls -l /dev/md/
total 0
lrwxrwxrwx 1 root root 6 Mar 23 08:10 0 -> ../md0
lrwxrwxrwx 1 root root 8 Mar 23 08:17 126 -> ../md126
lrwxrwxrwx 1 root root 8 Mar 23 08:17 127 -> ../md127
lrwxrwxrwx 1 root root 8 Mar 23 08:17 imsm0 -> ../md127
lrwxrwxrwx 1 root root 8 Mar 23 08:17 Volume0 -> ../md126

cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md126 : active raid5 sdb[3] sdc[2] sdd[1] sde[0]
      2930280448 blocks super external:/md127/0 level 5, 64k chunk,
algorithm 0 [4/4] [UUUU]
      [>....................]  resync =  1.8% (18285824/976760320)
finish=182.6min speed=87464K/sec

md127 : inactive sde[3](S) sdb[2](S) sdc[1](S) sdd[0](S)
      9028 blocks super external:imsm

unused devices: <none>

So the container device is getting moved from md0 to md127. Not sure why.

And would sure like to have a write-intent bitmap active to avoid this
resync issue which seems to be happening way too frequently.


On Tue, Mar 23, 2010 at 6:58 AM, Randy Terbush <randy@terbush.org> wrote:
> On Tue, Mar 23, 2010 at 2:04 AM, Luca Berra <bluca@comedia.it> wrote:
>>> # mdadm --version
>>> mdadm - v3.1.2 - 10th March 2010
>>>
>>> # mdadm -Es
>>> ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
>>> ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
>>> member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>>>
>>> # ls -l /dev/md/
>>> total 0
>>> lrwxrwxrwx 1 root root 6 Mar 22 20:54 0 -> ../md0
>>> lrwxrwxrwx 1 root root 8 Mar 22 20:54 127 -> ../md127
>>> lrwxrwxrwx 1 root root 8 Mar 22 20:54 Volume0_0 -> ../md127
>>>
>>> As you can see, the name for the link in /dev/md does not agree with
>>> the name that the Examine is coming up with.
>>
>> please read mdadm.conf manpage, under the section "HOMEHOST"
>
> If I understand this correctly, I think there still may be a problem
> as I am not clear on how I could have set the homehost in the metadata
> for this imsm array. The Volume0 is provided by imsm and is configured
> in the option ROM.
>
> The underlying question here is should the ARRAY entry in mdadm.conf
> be changed to reflect the on disk name of the device, or is the
> startup process munging that entry when it processes mdadm.conf to
> strip the _0.
>
> I'll try setting HOMEHOST <ignore> to see if I am getting expected results.
>
> I seem to have some problems with startup still as I have the
> following entry where the container is now md127. Was md0 when
> originally created.
>
> # cat /proc/mdstat
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> md126 : active raid5 sdb[3] sdc[2] sdd[1] sde[0]
>      2930280448 blocks super external:/md127/0 level 5, 64k chunk,
> algorithm 0 [4/4] [UUUU]
>
> md127 : inactive sde[3](S) sdb[2](S) sdc[1](S) sdd[0](S)
>      9028 blocks super external:imsm
>
> unused devices: <none>
>
> I am also running into a problem where fsck will crash during boot on
> the ext4 filesystems that this array contains. No problem running fsck
> after the boot process has completed so have not seemed to find the
> magic with order of startup for this device.
>
>
>>
>>> Is it better to just forgo the ARRAY statements and go with an AUTO +imsm?
>>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: (Re: Questions regarding startup of imsm container)
  2010-03-23 14:33     ` Randy Terbush
@ 2010-03-23 14:49       ` Randy Terbush
  2010-03-23 15:56       ` Luca Berra
  2010-03-23 22:41       ` Dan Williams
  2 siblings, 0 replies; 20+ messages in thread
From: Randy Terbush @ 2010-03-23 14:49 UTC (permalink / raw)
  To: linux-raid

So more info:

I can assemble this array as expected by doing the following:

mdadm -A /dev/md0 /dev/sd[bcde]
mdadm -I /dev/md0

I get:
# ls -l /dev/md/
total 0
lrwxrwxrwx 1 root root 6 Mar 23 08:40 0 -> ../md0
lrwxrwxrwx 1 root root 8 Mar 23 08:40 127 -> ../md127
lrwxrwxrwx 1 root root 8 Mar 23 08:40 Volume0_0 -> ../md127

and:
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md127 : active raid5 sdb[3] sdc[2] sdd[1] sde[0]
      2930280448 blocks super external:/md0/0 level 5, 64k chunk,
algorithm 0 [4/4] [UUUU]
      [=>...................]  resync =  5.8% (57270016/976760320)
finish=179.4min speed=85376K/sec

md0 : inactive sdb[3](S) sde[2](S) sdd[1](S) sdc[0](S)
      9028 blocks super external:imsm

unused devices: <none>

Not clear if this will force a resync every start...


On Tue, Mar 23, 2010 at 8:33 AM, Randy Terbush <randy@terbush.org> wrote:
> To follow-up this startup challenge... here is what I am getting.
>
> mdraid is being started with mdadm -As
>
> I have the following in mdadm.conf
>
> HOMEHOST Volume0
> #DEVICE /dev/sd[bcde]
> AUTO +imsm hifi:0 -all
> ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
> ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
> member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>
> The following devices are being created.
>
> # ls -l /dev/md/
> total 0
> lrwxrwxrwx 1 root root 6 Mar 23 08:10 0 -> ../md0
> lrwxrwxrwx 1 root root 8 Mar 23 08:17 126 -> ../md126
> lrwxrwxrwx 1 root root 8 Mar 23 08:17 127 -> ../md127
> lrwxrwxrwx 1 root root 8 Mar 23 08:17 imsm0 -> ../md127
> lrwxrwxrwx 1 root root 8 Mar 23 08:17 Volume0 -> ../md126
>
> cat /proc/mdstat
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> md126 : active raid5 sdb[3] sdc[2] sdd[1] sde[0]
>      2930280448 blocks super external:/md127/0 level 5, 64k chunk,
> algorithm 0 [4/4] [UUUU]
>      [>....................]  resync =  1.8% (18285824/976760320)
> finish=182.6min speed=87464K/sec
>
> md127 : inactive sde[3](S) sdb[2](S) sdc[1](S) sdd[0](S)
>      9028 blocks super external:imsm
>
> unused devices: <none>
>
> So the container device is getting moved from md0 to md127. Not sure why.
>
> And would sure like to have a write-intent bitmap active to avoid this
> resync issue which seems to be happening way too frequently.
>
>
> On Tue, Mar 23, 2010 at 6:58 AM, Randy Terbush <randy@terbush.org> wrote:
>> On Tue, Mar 23, 2010 at 2:04 AM, Luca Berra <bluca@comedia.it> wrote:
>>>> # mdadm --version
>>>> mdadm - v3.1.2 - 10th March 2010
>>>>
>>>> # mdadm -Es
>>>> ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
>>>> ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
>>>> member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>>>>
>>>> # ls -l /dev/md/
>>>> total 0
>>>> lrwxrwxrwx 1 root root 6 Mar 22 20:54 0 -> ../md0
>>>> lrwxrwxrwx 1 root root 8 Mar 22 20:54 127 -> ../md127
>>>> lrwxrwxrwx 1 root root 8 Mar 22 20:54 Volume0_0 -> ../md127
>>>>
>>>> As you can see, the name for the link in /dev/md does not agree with
>>>> the name that the Examine is coming up with.
>>>
>>> please read mdadm.conf manpage, under the section "HOMEHOST"
>>
>> If I understand this correctly, I think there still may be a problem
>> as I am not clear on how I could have set the homehost in the metadata
>> for this imsm array. The Volume0 is provided by imsm and is configured
>> in the option ROM.
>>
>> The underlying question here is should the ARRAY entry in mdadm.conf
>> be changed to reflect the on disk name of the device, or is the
>> startup process munging that entry when it processes mdadm.conf to
>> strip the _0.
>>
>> I'll try setting HOMEHOST <ignore> to see if I am getting expected results.
>>
>> I seem to have some problems with startup still as I have the
>> following entry where the container is now md127. Was md0 when
>> originally created.
>>
>> # cat /proc/mdstat
>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
>> md126 : active raid5 sdb[3] sdc[2] sdd[1] sde[0]
>>      2930280448 blocks super external:/md127/0 level 5, 64k chunk,
>> algorithm 0 [4/4] [UUUU]
>>
>> md127 : inactive sde[3](S) sdb[2](S) sdc[1](S) sdd[0](S)
>>      9028 blocks super external:imsm
>>
>> unused devices: <none>
>>
>> I am also running into a problem where fsck will crash during boot on
>> the ext4 filesystems that this array contains. No problem running fsck
>> after the boot process has completed so have not seemed to find the
>> magic with order of startup for this device.
>>
>>
>>>
>>>> Is it better to just forgo the ARRAY statements and go with an AUTO +imsm?
>>>>
>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: (Re: Questions regarding startup of imsm container)
  2010-03-23 14:33     ` Randy Terbush
  2010-03-23 14:49       ` Randy Terbush
@ 2010-03-23 15:56       ` Luca Berra
  2010-03-23 22:41       ` Dan Williams
  2 siblings, 0 replies; 20+ messages in thread
From: Luca Berra @ 2010-03-23 15:56 UTC (permalink / raw)
  To: linux-raid

On Tue, Mar 23, 2010 at 08:33:50AM -0600, Randy Terbush wrote:
>To follow-up this startup challenge... here is what I am getting.
>
>mdraid is being started with mdadm -As
>
>I have the following in mdadm.conf
>
>HOMEHOST Volume0
I don't think it matters.
>#DEVICE /dev/sd[bcde]
>AUTO +imsm hifi:0 -all
>ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
there is no name for this array, so mdadm will auto-allocate one; try:
ARRAY /dev/md0 metadata=imsm .....
if you don't like the dynamic minor (just a guess)
>ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
>member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>
>The following devices are being created.
>
># ls -l /dev/md/
>total 0
>lrwxrwxrwx 1 root root 6 Mar 23 08:10 0 -> ../md0
>lrwxrwxrwx 1 root root 8 Mar 23 08:17 126 -> ../md126
>lrwxrwxrwx 1 root root 8 Mar 23 08:17 127 -> ../md127
>lrwxrwxrwx 1 root root 8 Mar 23 08:17 imsm0 -> ../md127
>lrwxrwxrwx 1 root root 8 Mar 23 08:17 Volume0 -> ../md126
>
>cat /proc/mdstat
>Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
>md126 : active raid5 sdb[3] sdc[2] sdd[1] sde[0]
>      2930280448 blocks super external:/md127/0 level 5, 64k chunk,
>algorithm 0 [4/4] [UUUU]
>      [>....................]  resync =  1.8% (18285824/976760320)
>finish=182.6min speed=87464K/sec
>
>md127 : inactive sde[3](S) sdb[2](S) sdc[1](S) sdd[0](S)
>      9028 blocks super external:imsm
>
>unused devices: <none>
>
>So the container device is getting moved from md0 to md127. Not sure why.
>
>And would sure like to have a write-intent bitmap active to avoid this
>resync issue which seems to be happening way too frequently.
As far as I understand the code (which might not be correct):
It is impossible to have internal bitmaps on imsm arrays, since there is
no provisioning for them in the imsm metadata.
For external bitmaps, the kernel code checks that the mddev superblock
event counter and the external bitmap superblock event counter match
before activating the bitmap (you don't want a bitmap containing stale
information); the md counter is a u64, while imsm keeps a u32 number
used for a similar purpose, so it might be possible.
I have no clue whether it will work or not, though. This is why, in the
patch I sent previously, I preferred disabling bitmaps completely.

Regards,
L.

-- 
Luca Berra -- bluca@comedia.it
         Communication Media & Services S.r.l.
  /"\
  \ /     ASCII RIBBON CAMPAIGN
   X        AGAINST HTML MAIL
  / \

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Questions regarding startup of imsm container
  2010-03-23  3:56 Questions regarding startup of imsm container Randy Terbush
  2010-03-23  8:04 ` [PATCH] (Re: Questions regarding startup of imsm container) Luca Berra
@ 2010-03-23 21:01 ` Dan Williams
  2010-03-23 21:41   ` Randy Terbush
  1 sibling, 1 reply; 20+ messages in thread
From: Dan Williams @ 2010-03-23 21:01 UTC (permalink / raw)
  To: Randy Terbush; +Cc: linux-raid

On Mon, Mar 22, 2010 at 8:56 PM, Randy Terbush <randy@terbush.org> wrote:
> Having a go at building a raid5 array using the new imsm support and
> having good luck keeping drives in the array, etc. Nice work. I have a
> few questions though as I am having some trouble figuring out how to
> properly start this container.
>
> # mdadm --version
> mdadm - v3.1.2 - 10th March 2010
>
> # mdadm -Es
> ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
> ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
> member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>
> # ls -l /dev/md/
> total 0
> lrwxrwxrwx 1 root root 6 Mar 22 20:54 0 -> ../md0
> lrwxrwxrwx 1 root root 8 Mar 22 20:54 127 -> ../md127
> lrwxrwxrwx 1 root root 8 Mar 22 20:54 Volume0_0 -> ../md127
>
> As you can see, the name for the link in /dev/md does not agree with
> the name that the Examine is coming up with.

It looks like your array was auto-assembled without the help of a
configuration file (what distribution are you using?).  When there is
no configuration file mdadm will append _<num> to indicate that this
might be a foreign array (i.e. an array from another system).  The
only way the imsm code knows that the array is local is by having an
up to date mdadm.conf file with all your local arrays listed.

If you add the following lines to your configuration file:
ARRAY /dev/md0 UUID=30223250:76fd248b:50280919:0836b7f0
ARRAY /dev/md/Volume0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c

You should get the expected name.  Note that the arrays might be being
assembled by your initramfs environment.  So you may need to re-run
mkinitrd after modifying mdadm.conf.
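
A rough sketch of that flow (the exact initramfs command varies by
distribution, so treat this as an example only):

# append the arrays mdadm currently sees, then edit the names as desired
mdadm -Es >> /etc/mdadm.conf
# rebuild the initramfs so the early-boot copy of mdadm.conf matches
mkinitrd    # or dracut / genkernel, whatever your distribution uses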

> Is it better to just forgo the ARRAY statements and go with an AUTO +imsm?

If you don't mind the naming discrepancy then you can keep your current setup.

> And last, does the concept of a write-intent bitmap make sense on an
> imsm container? If so, I get a segv if trying to run mdadm /dev/mdX
> -Gb internal on either device.

That should be disallowed for imsm rather than segfault, I'll take a
look at addressing that.  The current write-intent bitmap
implementation is only compatible with native md-metadata for an
internal bitmap.  However, you should still be able to use an external
bitmap with imsm.  The imsm internal equivalent "dirty-stripe journal"
is still on the to do list.
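
For example, something along these lines should work (untested sketch;
the bitmap path is made up and must live on storage outside the imsm
array):

mdadm --grow --bitmap=/some/other/disk/Volume0-bitmap /dev/md/Volume0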

> Thanks for your help

Thanks for the report,
Dan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Questions regarding startup of imsm container
  2010-03-23 21:01 ` Questions regarding startup of imsm container Dan Williams
@ 2010-03-23 21:41   ` Randy Terbush
  2010-03-23 22:16     ` Dan Williams
  0 siblings, 1 reply; 20+ messages in thread
From: Randy Terbush @ 2010-03-23 21:41 UTC (permalink / raw)
  To: Dan Williams, linux raid

Thanks for the suggestions from Dan and others. I've managed to pin
the names of the raid devices. Getting closer to figuring out the
startup problem hopefully. kernel trace included below...

This is running on Gentoo with kernel 2.6.30.
Linux hifi 2.6.30-gentoo-r9 #1 SMP Mon Mar 22 08:25:58 MDT 2010 x86_64
Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz GenuineIntel GNU/Linux

Gentoo calls a bit of startup script that looks like this:

# Start software raid with mdadm (new school)
mdadm_conf="/etc/mdadm/mdadm.conf"
[ -e /etc/mdadm.conf ] && mdadm_conf="/etc/mdadm.conf"
if [ -x /sbin/mdadm -a -f "${mdadm_conf}" ] ; then
    devs=$(awk '/^[[:space:]]*ARRAY/ { print $2 }' "${mdadm_conf}")
    if [ -n "${devs}" ]; then
        ebegin "Starting up RAID devices"
        output=$(mdadm -As 2>&1)
        ret=$?
        [ ${ret} -ne 0 ] && echo "${output}"
        eend ${ret}
    fi
fi

That seems to exit without error.

My mdadm.conf now looks like this:
ARRAY /dev/md/0 metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
ARRAY /dev/md/127 container=30223250:76fd248b:50280919:0836b7f0
member=0 bitmap=/boot/md/127-RAID5-bitmap
UUID=8a4ae452:da1e7832:70ecf895:eb58229c

rc-logging shows me this:

rc default logging started at Tue Mar 23 15:28:05 2010

 * Checking local filesystems  ...
/dev/sda3: clean, 720743/9502720 files, 8557843/37997741 blocks
/dev/sda1: clean, 45/26104 files, 26216/104388 blocks
fsck.ext4: Device or resource busy while trying to open /dev/mapper/vg0-home
Filesystem mounted or opened exclusively by another program?
fsck.ext4: Device or resource busy while trying to open /dev/mapper/vg0-svn
Filesystem mounted or opened exclusively by another program?
fsck.ext4: Device or resource busy while trying to open /dev/mapper/vg0-archive
Filesystem mounted or opened exclusively by another program?
fsck.ext4: Device or resource busy while trying to open /dev/mapper/vg0-media
Filesystem mounted or opened exclusively by another program?
 * Operational error
 [ !! ]
 * Remounting root filesystem read/write ...
 [ ok ]
 * Updating /etc/mtab ...
 [ ok ]
 * Mounting local filesystems ...
mount: /dev/mapper/vg0-home already mounted or /home busy
mount: /dev/mapper/vg0-svn already mounted or /svn busy
mount: /dev/mapper/vg0-archive already mounted or /archive busy
mount: /dev/mapper/vg0-media already mounted or /mediaslice busy
 * Some local filesystem failed to mount

---------------------------------------------------------------------------------------------------------------------------

And /proc/mdstat says:

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md127 : active (read-only) raid5 sdb[3] sdc[2] sdd[1] sde[0]
      2930280448 blocks super external:/md0/0 level 5, 64k chunk,
algorithm 0 [4/4] [UUUU]
        resync=PENDING

md0 : inactive sde[3](S) sdb[2](S) sdc[1](S) sdd[0](S)
      9028 blocks super external:imsm

unused devices: <none>

----------------------------------------------------------------------------------------------------------------------------------------

And a look at the kernel messages from dmesg says:

[   16.300609] md: md0 stopped.
[   16.319715] md: bind<sdd>
[   16.319787] md: bind<sdc>
[   16.319842] md: bind<sdb>
[   16.319895] md: bind<sde>
[   16.343621] md: md127 stopped.
[   16.343794] md: bind<sde>
[   16.343884] md: bind<sdd>
[   16.343965] md: bind<sdc>
[   16.344053] md: bind<sdb>
[   16.345115] raid5: device sdb operational as raid disk 0
[   16.345117] raid5: device sdc operational as raid disk 1
[   16.345119] raid5: device sdd operational as raid disk 2
[   16.345120] raid5: device sde operational as raid disk 3
[   16.345552] raid5: allocated 4270kB for md127
[   16.345591] raid5: raid level 5 set md127 active with 4 out of 4
devices, algorithm 0
[   16.345594] RAID5 conf printout:
[   16.345596]  --- rd:4 wd:4
[   16.345599]  disk 0, o:1, dev:sdb
[   16.345601]  disk 1, o:1, dev:sdc
[   16.345603]  disk 2, o:1, dev:sdd
[   16.345606]  disk 3, o:1, dev:sde
[   16.346507] md127: detected capacity change from 0 to 3000607178752
[   16.346510]  md127: unknown partition table
[   21.506666] device-mapper: ioctl: 4.14.0-ioctl (2008-04-23)
initialised: dm-devel@redhat.com
[   22.918983] ------------[ cut here ]------------
[   22.918986] kernel BUG at drivers/md/md.c:6139!
[   22.918988] invalid opcode: 0000 [#1] SMP
[   22.918990] last sysfs file:
/sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/uevent
[   22.918992] CPU 0
[   22.918993] Modules linked in: dm_mod wanrouter sdladrv wctdm
dahdi_echocan_mg2 dahdi mxl5005s s5h1409 tuner_simple tuner_types
snd_hda_codec_analog cs5345 tuner cx18 dvb_core cx2341x snd_hda_intel
v4l2_common videodev snd_hda_codec nvidia(P) v4l1_compat
v4l2_compat_ioctl32 snd_hwdep ir_common snd_pcm snd_timer ir_core snd
snd_page_alloc tveeprom
[   22.919010] Pid: 2628, comm: fsck.ext4 Tainted: P
2.6.30-gentoo-r9 #1 P5K3 Deluxe
[   22.919012] RIP: 0010:[<ffffffff813c72b6>]  [<ffffffff813c72b6>]
md_write_start+0x22/0x14d
[   22.919019] RSP: 0018:ffff8802275799d8  EFLAGS: 00010246
[   22.919021] RAX: 0000000000000001 RBX: ffff88022c450400 RCX: 00000000000000ff
[   22.919023] RDX: 00000000c8040180 RSI: ffff88022bde1b40 RDI: ffff88022c450400
[   22.919025] RBP: ffff880227579a28 R08: 0000000000001000 R09: ffff88022d8f3e60
[   22.919027] R10: 000000102ad44690 R11: 7fffffffffffffff R12: ffff88022c450400
[   22.919029] R13: ffff88022c4d0000 R14: 00000000c8040180 R15: 000000000090007f
[   22.919032] FS:  00007f02a33fc760(0000) GS:ffff880028034000(0000)
knlGS:0000000000000000
[   22.919034] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   22.919036] CR2: 00007f02a249beca CR3: 00000002279b6000 CR4: 00000000000026e0
[   22.919038] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   22.919040] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   22.919042] Process fsck.ext4 (pid: 2628, threadinfo
ffff880227578000, task ffff88022a8257c0)
[   22.919044] Stack:
[   22.919044]  0000000000000004 ffff88022d8f3e60 0000000000000001
0000000000000010
[   22.919047]  ffff880227579a48 ffffffff810f3913 ffff880227579a18
ffff88022bde1b40
[   22.919049]  ffff88022c450400 ffff88022c4d0000 ffff880227579ae8
ffffffff813c1707
[   22.919052] Call Trace:
[   22.919054]  [<ffffffff810f3913>] ? bio_alloc_bioset+0x48/0xbd
[   22.919058]  [<ffffffff813c1707>] make_request+0x4b/0x61b
[   22.919061]  [<ffffffffa0a8f9b6>] ? __map_bio+0xad/0x10c [dm_mod]
[   22.919069]  [<ffffffffa0a90497>] ?
__split_and_process_bio+0x400/0x40f [dm_mod]
[   22.919075]  [<ffffffff813c74ed>] md_make_request+0xb6/0xf4
[   22.919078]  [<ffffffff812436ef>] ? __up_read+0x8d/0x96
[   22.919081]  [<ffffffff8106a163>] ? up_read+0x9/0xb
[   22.919084]  [<ffffffff8122fe9e>] generic_make_request+0x213/0x25d
[   22.919087]  [<ffffffff8104def0>] ? try_to_wake_up+0x28f/0x2a1
[   22.919091]  [<ffffffff8122ffb7>] submit_bio+0xcf/0xd8
[   22.919093]  [<ffffffff810ef62b>] submit_bh+0xdd/0x100
[   22.919096]  [<ffffffff810f1907>] __block_write_full_page+0x1d3/0x2ae
[   22.919099]  [<ffffffff810f3c75>] ? blkdev_get_block+0x0/0x5e
[   22.919101]  [<ffffffff810f1f34>] ? end_buffer_async_write+0x0/0x101
[   22.919104]  [<ffffffff810f1f34>] ? end_buffer_async_write+0x0/0x101
[   22.919106]  [<ffffffff810f3c75>] ? blkdev_get_block+0x0/0x5e
[   22.919109]  [<ffffffff810f1a66>] block_write_full_page_endio+0x84/0x91
[   22.919112]  [<ffffffff810f1a83>] block_write_full_page+0x10/0x12
[   22.919114]  [<ffffffff810f482f>] blkdev_writepage+0x13/0x15
[   22.919117]  [<ffffffff810a6ec1>] __writepage+0x12/0x2b
[   22.919120]  [<ffffffff810a74ee>] write_cache_pages+0x244/0x3c1
[   22.919123]  [<ffffffff810a6eaf>] ? __writepage+0x0/0x2b
[   22.919126]  [<ffffffff812615bd>] ? do_output_char+0x92/0x1bd
[   22.919129]  [<ffffffff81067220>] ? remove_wait_queue+0x40/0x45
[   22.919133]  [<ffffffff810a768a>] generic_writepages+0x1f/0x21
[   22.919136]  [<ffffffff810a76b4>] do_writepages+0x28/0x37
[   22.919138]  [<ffffffff810a1a6f>] __filemap_fdatawrite_range+0x4b/0x4d
[   22.919141]  [<ffffffff810a2233>] filemap_fdatawrite+0x1a/0x1c
[   22.919144]  [<ffffffff810ee3ad>] vfs_fsync+0x51/0xa7
[   22.919146]  [<ffffffff810ee432>] do_fsync+0x2f/0x44
[   22.919148]  [<ffffffff810ee464>] sys_fsync+0xb/0xf
[   22.919150]  [<ffffffff81026ac2>] system_call_fastpath+0x16/0x1b
[   22.919154] Code: 5c 41 5d 41 5e 41 5f c9 c3 55 48 89 e5 41 55 41
54 53 48 89 fb 48 83 ec 38 f6 46 20 01 0f 84 28 01 00 00 8b 47 38 83
f8 01 75 04 <0f> 0b eb fe 45 31 e4 83 f8 02 75 2a c7 47 38 00 00 00 00
f0 80
[   22.919170] RIP  [<ffffffff813c72b6>] md_write_start+0x22/0x14d
[   22.919173]  RSP <ffff8802275799d8>
[   22.919184] ---[ end trace f4d0f953e0b63e69 ]---
[   22.950663] ------------[ cut here ]------------
[   22.950665] kernel BUG at drivers/md/md.c:6139!
[   22.950667] invalid opcode: 0000 [#2] SMP
[   22.950668] last sysfs file:
/sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/uevent
[   22.950670] CPU 0
[   22.950671] Modules linked in: dm_mod wanrouter sdladrv wctdm
dahdi_echocan_mg2 dahdi mxl5005s s5h1409 tuner_simple tuner_types
snd_hda_codec_analog cs5345 tuner cx18 dvb_core cx2341x snd_hda_intel
v4l2_common videodev snd_hda_codec nvidia(P) v4l1_compat
v4l2_compat_ioctl32 snd_hwdep ir_common snd_pcm snd_timer ir_core snd
snd_page_alloc tveeprom
[   22.950686] Pid: 2629, comm: fsck.ext4 Tainted: P      D
2.6.30-gentoo-r9 #1 P5K3 Deluxe
[   22.950688] RIP: 0010:[<ffffffff813c72b6>]  [<ffffffff813c72b6>]
md_write_start+0x22/0x14d
[   22.950692] RSP: 0018:ffff8802279b59d8  EFLAGS: 00010246
[   22.950694] RAX: 0000000000000001 RBX: ffff88022c450400 RCX: 00000000000000ff
[   22.950696] RDX: 00000000da840180 RSI: ffff88022bde1a80 RDI: ffff88022c450400
[   22.950698] RBP: ffff8802279b5a28 R08: 0000000000001000 R09: ffff88022d9d9ee0
[   22.950700] R10: 000000102acbbb10 R11: 7fffffffffffffff R12: ffff88022c450400
[   22.950702] R13: ffff88022c4d0000 R14: 00000000da840180 R15: 000000000090007f
[   22.950705] FS:  00007f3acd410760(0000) GS:ffff880028034000(0000)
knlGS:0000000000000000
[   22.950707] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   22.950709] CR2: 00007f3acc4afeca CR3: 0000000227009000 CR4: 00000000000026e0
[   22.950711] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   22.950713] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   22.950715] Process fsck.ext4 (pid: 2629, threadinfo
ffff8802279b4000, task ffff88022a871d40)
[   22.950717] Stack:
[   22.950718]  0000000000000004 ffff88022d9d9ee0 0000000000000001
0000000000000010
[   22.950720]  ffff8802279b5a48 ffffffff810f3913 ffff8802279b5a18
ffff88022bde1a80
[   22.950722]  ffff88022c450400 ffff88022c4d0000 ffff8802279b5ae8
ffffffff813c1707
[   22.950725] Call Trace:
[   22.950726]  [<ffffffff810f3913>] ? bio_alloc_bioset+0x48/0xbd
[   22.950729]  [<ffffffff813c1707>] make_request+0x4b/0x61b
[   22.950732]  [<ffffffffa0a8f9b6>] ? __map_bio+0xad/0x10c [dm_mod]
[   22.950739]  [<ffffffffa0a90497>] ?
__split_and_process_bio+0x400/0x40f [dm_mod]
[   22.950745]  [<ffffffff813c74ed>] md_make_request+0xb6/0xf4
[   22.950748]  [<ffffffff812436ef>] ? __up_read+0x8d/0x96
[   22.950751]  [<ffffffff8106a163>] ? up_read+0x9/0xb
[   22.950753]  [<ffffffff8122fe9e>] generic_make_request+0x213/0x25d
[   22.950756]  [<ffffffff8104def0>] ? try_to_wake_up+0x28f/0x2a1
[   22.950759]  [<ffffffff8122ffb7>] submit_bio+0xcf/0xd8
[   22.950761]  [<ffffffff810ef62b>] submit_bh+0xdd/0x100
[   22.950764]  [<ffffffff810f1907>] __block_write_full_page+0x1d3/0x2ae
[   22.950766]  [<ffffffff810f3c75>] ? blkdev_get_block+0x0/0x5e
[   22.950769]  [<ffffffff810f1f34>] ? end_buffer_async_write+0x0/0x101
[   22.950772]  [<ffffffff810f1f34>] ? end_buffer_async_write+0x0/0x101
[   22.950774]  [<ffffffff810f3c75>] ? blkdev_get_block+0x0/0x5e
[   22.950777]  [<ffffffff810f1a66>] block_write_full_page_endio+0x84/0x91
[   22.950779]  [<ffffffff810f1a83>] block_write_full_page+0x10/0x12
[   22.950782]  [<ffffffff810f482f>] blkdev_writepage+0x13/0x15
[   22.950785]  [<ffffffff810a6ec1>] __writepage+0x12/0x2b
[   22.950787]  [<ffffffff810a74ee>] write_cache_pages+0x244/0x3c1
[   22.950790]  [<ffffffff810a6eaf>] ? __writepage+0x0/0x2b
[   22.950793]  [<ffffffff812615bd>] ? do_output_char+0x92/0x1bd
[   22.950796]  [<ffffffff81067220>] ? remove_wait_queue+0x40/0x45
[   22.950799]  [<ffffffff810a768a>] generic_writepages+0x1f/0x21
[   22.950801]  [<ffffffff810a76b4>] do_writepages+0x28/0x37
[   22.950804]  [<ffffffff810a1a6f>] __filemap_fdatawrite_range+0x4b/0x4d
[   22.950806]  [<ffffffff810a2233>] filemap_fdatawrite+0x1a/0x1c
[   22.950809]  [<ffffffff810ee3ad>] vfs_fsync+0x51/0xa7
[   22.950811]  [<ffffffff810ee432>] do_fsync+0x2f/0x44
[   22.950813]  [<ffffffff810ee464>] sys_fsync+0xb/0xf
[   22.950815]  [<ffffffff81026ac2>] system_call_fastpath+0x16/0x1b
[   22.950819] Code: 5c 41 5d 41 5e 41 5f c9 c3 55 48 89 e5 41 55 41
54 53 48 89 fb 48 83 ec 38 f6 46 20 01 0f 84 28 01 00 00 8b 47 38 83
f8 01 75 04 <0f> 0b eb fe 45 31 e4 83 f8 02 75 2a c7 47 38 00 00 00 00
f0 80
[   22.950834] RIP  [<ffffffff813c72b6>] md_write_start+0x22/0x14d
[   22.950837]  RSP <ffff8802279b59d8>
[   22.950839] ---[ end trace f4d0f953e0b63e6a ]---
[   22.990861] ------------[ cut here ]------------
[   22.990865] kernel BUG at drivers/md/md.c:6139!
[   22.990868] invalid opcode: 0000 [#3] SMP
[   22.990871] last sysfs file:
/sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/uevent
[   22.990874] CPU 3
[   22.990876] Modules linked in: dm_mod wanrouter sdladrv wctdm
dahdi_echocan_mg2 dahdi mxl5005s s5h1409 tuner_simple tuner_types
snd_hda_codec_analog cs5345 tuner cx18 dvb_core cx2341x snd_hda_intel
v4l2_common videodev snd_hda_codec nvidia(P) v4l1_compat
v4l2_compat_ioctl32 snd_hwdep ir_common snd_pcm snd_timer ir_core snd
snd_page_alloc tveeprom
[   22.990900] Pid: 2631, comm: fsck.ext4 Tainted: P      D
2.6.30-gentoo-r9 #1 P5K3 Deluxe
[   22.990903] RIP: 0010:[<ffffffff813c72b6>]  [<ffffffff813c72b6>]
md_write_start+0x22/0x14d
[   22.990913] RSP: 0018:ffff8802275799d8  EFLAGS: 00010246
[   22.990915] RAX: 0000000000000001 RBX: ffff88022c450400 RCX: 00000000000000ff
[   22.990919] RDX: 00000000e7040180 RSI: ffff88022c55be40 RDI: ffff88022c450400
[   22.990922] RBP: ffff880227579a28 R08: 0000000000001000 R09: ffff88022d877f20
[   22.990925] R10: 000000102acba990 R11: 7fffffffffffffff R12: ffff88022c450400
[   22.990928] R13: ffff88022c4d0000 R14: 00000000e7040180 R15: 000000000090007f
[   22.990931] FS:  00007f1d01181760(0000) GS:ffff880028082000(0000)
knlGS:0000000000000000
[   22.990934] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   22.990937] CR2: 00007f1d00220eca CR3: 00000002279b6000 CR4: 00000000000026e0
[   22.990940] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   22.990943] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   22.990952] Process fsck.ext4 (pid: 2631, threadinfo
ffff880227578000, task ffff88022a870ea0)
[   22.990954] Stack:
[   22.990955]  0000000000000004 ffff88022d877f20 0000000000000001
0000000000000010
[   22.990957]  ffff880227579a48 ffffffff810f3913 ffff880227579a18
ffff88022c55be40
[   22.990960]  ffff88022c450400 ffff88022c4d0000 ffff880227579ae8
ffffffff813c1707
[   22.990962] Call Trace:
[   22.990964]  [<ffffffff810f3913>] ? bio_alloc_bioset+0x48/0xbd
[   22.990968]  [<ffffffff813c1707>] make_request+0x4b/0x61b
[   22.990972]  [<ffffffffa0a8f9b6>] ? __map_bio+0xad/0x10c [dm_mod]
[   22.990979]  [<ffffffffa0a90497>] ?
__split_and_process_bio+0x400/0x40f [dm_mod]
[   22.990986]  [<ffffffff813c74ed>] md_make_request+0xb6/0xf4
[   22.990988]  [<ffffffff812436ef>] ? __up_read+0x8d/0x96
[   22.990992]  [<ffffffff8106a163>] ? up_read+0x9/0xb
[   22.990995]  [<ffffffff8122fe9e>] generic_make_request+0x213/0x25d
[   22.990998]  [<ffffffff810df3d1>] ? pollwake+0x40/0x42
[   22.991001]  [<ffffffff8122ffb7>] submit_bio+0xcf/0xd8
[   22.991003]  [<ffffffff810ef62b>] submit_bh+0xdd/0x100
[   22.991006]  [<ffffffff810f1907>] __block_write_full_page+0x1d3/0x2ae
[   22.991009]  [<ffffffff810f3c75>] ? blkdev_get_block+0x0/0x5e
[   22.991011]  [<ffffffff810f1f34>] ? end_buffer_async_write+0x0/0x101
[   22.991014]  [<ffffffff810f1f34>] ? end_buffer_async_write+0x0/0x101
[   22.991017]  [<ffffffff810f3c75>] ? blkdev_get_block+0x0/0x5e
[   22.991019]  [<ffffffff810f1a66>] block_write_full_page_endio+0x84/0x91
[   22.991022]  [<ffffffff810f1a83>] block_write_full_page+0x10/0x12
[   22.991024]  [<ffffffff810f482f>] blkdev_writepage+0x13/0x15
[   22.991027]  [<ffffffff810a6ec1>] __writepage+0x12/0x2b
[   22.991031]  [<ffffffff810a74ee>] write_cache_pages+0x244/0x3c1
[   22.991034]  [<ffffffff810a6eaf>] ? __writepage+0x0/0x2b
[   22.991036]  [<ffffffff812615bd>] ? do_output_char+0x92/0x1bd
[   22.991040]  [<ffffffff81067220>] ? remove_wait_queue+0x40/0x45
[   22.991044]  [<ffffffff810a768a>] generic_writepages+0x1f/0x21
[   22.991047]  [<ffffffff810a76b4>] do_writepages+0x28/0x37
[   22.991049]  [<ffffffff810a1a6f>] __filemap_fdatawrite_range+0x4b/0x4d
[   22.991052]  [<ffffffff810a2233>] filemap_fdatawrite+0x1a/0x1c
[   22.991054]  [<ffffffff810ee3ad>] vfs_fsync+0x51/0xa7
[   22.991057]  [<ffffffff810ee432>] do_fsync+0x2f/0x44
[   22.991059]  [<ffffffff810ee464>] sys_fsync+0xb/0xf
[   22.991061]  [<ffffffff81026ac2>] system_call_fastpath+0x16/0x1b
[   22.991066] Code: 5c 41 5d 41 5e 41 5f c9 c3 55 48 89 e5 41 55 41
54 53 48 89 fb 48 83 ec 38 f6 46 20 01 0f 84 28 01 00 00 8b 47 38 83
f8 01 75 04 <0f> 0b eb fe 45 31 e4 83 f8 02 75 2a c7 47 38 00 00 00 00
f0 80
[   22.991081] RIP  [<ffffffff813c72b6>] md_write_start+0x22/0x14d
[   22.991084]  RSP <ffff8802275799d8>
[   22.991087] ---[ end trace f4d0f953e0b63e6b ]---
[   23.022827] ------------[ cut here ]------------
[   23.023001] kernel BUG at drivers/md/md.c:6139!
[   23.023176] invalid opcode: 0000 [#4] SMP
[   23.023381] last sysfs file:
/sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/uevent
[   23.023381] CPU 3
[   23.023381] Modules linked in: dm_mod wanrouter sdladrv wctdm
dahdi_echocan_mg2 dahdi mxl5005s s5h1409 tuner_simple tuner_types
snd_hda_codec_analog cs5345 tuner cx18 dvb_core cx2341x snd_hda_intel
v4l2_common videodev snd_hda_codec nvidia(P) v4l1_compat
v4l2_compat_ioctl32 snd_hwdep ir_common snd_pcm snd_timer ir_core snd
snd_page_alloc tveeprom
[   23.023381] Pid: 2632, comm: fsck.ext4 Tainted: P      D
2.6.30-gentoo-r9 #1 P5K3 Deluxe
[   23.023381] RIP: 0010:[<ffffffff813c72b6>]  [<ffffffff813c72b6>]
md_write_start+0x22/0x14d
[   23.023381] RSP: 0018:ffff8802279b59d8  EFLAGS: 00010246
[   23.023381] RAX: 0000000000000001 RBX: ffff88022c450400 RCX: 00000000000000ff
[   23.023381] RDX: 000000005d840180 RSI: ffff88022b56ecc0 RDI: ffff88022c450400
[   23.023381] RBP: ffff8802279b5a28 R08: 0000000000001000 R09: ffff88022780ae20
[   23.023381] R10: 000000102edd93b0 R11: 7fffffffffffffff R12: ffff88022c450400
[   23.023381] R13: ffff88022c4d0000 R14: 000000005d840180 R15: 000000000090007f
[   23.023381] FS:  00007f2604bcc760(0000) GS:ffff880028082000(0000)
knlGS:0000000000000000
[   23.023381] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   23.023381] CR2: 00007f2603c6beca CR3: 0000000227041000 CR4: 00000000000026e0
[   23.023381] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   23.023381] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   23.023381] Process fsck.ext4 (pid: 2632, threadinfo
ffff8802279b4000, task ffff88022d9195f0)
[   23.023381] Stack:
[   23.023381]  0000000000000004 ffff88022780ae20 0000000000000001
0000000000000010
[   23.023381]  ffff8802279b5a48 ffffffff810f3913 ffff8802279b5a18
ffff88022b56ecc0
[   23.023381]  ffff88022c450400 ffff88022c4d0000 ffff8802279b5ae8
ffffffff813c1707
[   23.023381] Call Trace:
[   23.023381]  [<ffffffff810f3913>] ? bio_alloc_bioset+0x48/0xbd
[   23.023381]  [<ffffffff813c1707>] make_request+0x4b/0x61b
[   23.023381]  [<ffffffffa0a8f9b6>] ? __map_bio+0xad/0x10c [dm_mod]
[   23.023381]  [<ffffffffa0a90497>] ?
__split_and_process_bio+0x400/0x40f [dm_mod]
[   23.023381]  [<ffffffff813c74ed>] md_make_request+0xb6/0xf4
[   23.023381]  [<ffffffff812436ef>] ? __up_read+0x8d/0x96
[   23.023381]  [<ffffffff8106a163>] ? up_read+0x9/0xb
[   23.023381]  [<ffffffff8122fe9e>] generic_make_request+0x213/0x25d
[   23.023381]  [<ffffffff8104def0>] ? try_to_wake_up+0x28f/0x2a1
[   23.023381]  [<ffffffff8122ffb7>] submit_bio+0xcf/0xd8
[   23.023381]  [<ffffffff810ef62b>] submit_bh+0xdd/0x100
[   23.023381]  [<ffffffff810f1907>] __block_write_full_page+0x1d3/0x2ae
[   23.023381]  [<ffffffff810f3c75>] ? blkdev_get_block+0x0/0x5e
[   23.023381]  [<ffffffff810f1f34>] ? end_buffer_async_write+0x0/0x101
[   23.023381]  [<ffffffff810f1f34>] ? end_buffer_async_write+0x0/0x101
[   23.023381]  [<ffffffff810f3c75>] ? blkdev_get_block+0x0/0x5e
[   23.023381]  [<ffffffff810f1a66>] block_write_full_page_endio+0x84/0x91
[   23.023381]  [<ffffffff810f1a83>] block_write_full_page+0x10/0x12
[   23.023381]  [<ffffffff810f482f>] blkdev_writepage+0x13/0x15
[   23.023381]  [<ffffffff810a6ec1>] __writepage+0x12/0x2b
[   23.023381]  [<ffffffff810a74ee>] write_cache_pages+0x244/0x3c1
[   23.023381]  [<ffffffff810a6eaf>] ? __writepage+0x0/0x2b
[   23.023381]  [<ffffffff812615bd>] ? do_output_char+0x92/0x1bd
[   23.023381]  [<ffffffff81067220>] ? remove_wait_queue+0x40/0x45
[   23.023381]  [<ffffffff810a768a>] generic_writepages+0x1f/0x21
[   23.023381]  [<ffffffff810a76b4>] do_writepages+0x28/0x37
[   23.023381]  [<ffffffff810a1a6f>] __filemap_fdatawrite_range+0x4b/0x4d
[   23.023381]  [<ffffffff810a2233>] filemap_fdatawrite+0x1a/0x1c
[   23.023381]  [<ffffffff810ee3ad>] vfs_fsync+0x51/0xa7
[   23.023381]  [<ffffffff810ee432>] do_fsync+0x2f/0x44
[   23.023381]  [<ffffffff810ee464>] sys_fsync+0xb/0xf
[   23.023381]  [<ffffffff81026ac2>] system_call_fastpath+0x16/0x1b
[   23.023381] Code: 5c 41 5d 41 5e 41 5f c9 c3 55 48 89 e5 41 55 41
54 53 48 89 fb 48 83 ec 38 f6 46 20 01 0f 84 28 01 00 00 8b 47 38 83
f8 01 75 04 <0f> 0b eb fe 45 31 e4 83 f8 02 75 2a c7 47 38 00 00 00 00
f0 80
[   23.023381] RIP  [<ffffffff813c72b6>] md_write_start+0x22/0x14d
[   23.023381]  RSP <ffff8802279b59d8>
[   23.040045] ---[ end trace f4d0f953e0b63e6c ]---

On Tue, Mar 23, 2010 at 3:01 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Mon, Mar 22, 2010 at 8:56 PM, Randy Terbush <randy@terbush.org> wrote:
>> Having a go at building a raid5 array using the new imsm support and
>> having good luck keeping drives in the array, etc. Nice work. I have a
>> few questions though as I am having some trouble figuring out how to
>> properly start this container.
>>
>> # mdadm --version
>> mdadm - v3.1.2 - 10th March 2010
>>
>> # mdadm -Es
>> ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
>> ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
>> member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>>
>> # ls -l /dev/md/
>> total 0
>> lrwxrwxrwx 1 root root 6 Mar 22 20:54 0 -> ../md0
>> lrwxrwxrwx 1 root root 8 Mar 22 20:54 127 -> ../md127
>> lrwxrwxrwx 1 root root 8 Mar 22 20:54 Volume0_0 -> ../md127
>>
>> As you can see, the name for the link in /dev/md does not agree with
>> the name that the Examine is coming up with.
>
> It looks like your array was auto-assembled without the help of a
> configuration file (what distribution are you using?).  When there is
> no configuration file mdadm will append _<num> to indicate that this
> might be a foreign array (i.e. an array from another system).  The
> only way the imsm code knows that the array is local is by having an
> up to date mdadm.conf file with all your local arrays listed.
>
> If you add the following lines to your configuration file:
> ARRAY /dev/md0 UUID=30223250:76fd248b:50280919:0836b7f0
> ARRAY /dev/md/Volume0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>
> You should get the expected name.  Note that the arrays might be being
> assembled by your initramfs environment.  So you may need to re-run
> mkinitrd after modifying mdadm.conf.
>
>> Is it better to just forgo the ARRAY statements and go with an AUTO +imsm?
>
> If you don't mind the naming discrepancy then you can keep your current setup.
>
>> And last, does the concept of a write-intent bitmap make sense on an
>> imsm container? If so, I get a segv if trying to run mdadm /dev/mdX
>> -Gb internal on either device.
>
> That should be disallowed for imsm rather than segfault, I'll take a
> look at addressing that.  The current write-intent bitmap
> implementation is only compatible with native md-metadata for an
> internal bitmap.  However, you should still be able to use an external
> bitmap with imsm.  The imsm internal equivalent "dirty-stripe journal"
> is still on the to do list.
>
>> Thanks for your help
>
> Thanks for the report,
> Dan
>
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Questions regarding startup of imsm container
  2010-03-23 21:41   ` Randy Terbush
@ 2010-03-23 22:16     ` Dan Williams
  2010-03-23 23:25       ` Randy Terbush
  0 siblings, 1 reply; 20+ messages in thread
From: Dan Williams @ 2010-03-23 22:16 UTC (permalink / raw)
  To: Randy Terbush; +Cc: linux raid

Randy Terbush wrote:
> Thanks for the suggestions from Dan and others. I've managed to pin
> the names of the raid devices. Getting closer to figuring out the
> startup problem hopefully. kernel trace included below...
> 
> This is running on Gentoo with kernel 2.6.30.
> Linux hifi 2.6.30-gentoo-r9 #1 SMP Mon Mar 22 08:25:58 MDT 2010 x86_64
> Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz GenuineIntel GNU/Linux
> 
[..]
> And /proc/mdstat says:
> 
> # cat /proc/mdstat
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> md127 : active (read-only) raid5 sdb[3] sdc[2] sdd[1] sde[0]
>       2930280448 blocks super external:/md0/0 level 5, 64k chunk,
> algorithm 0 [4/4] [UUUU]
>         resync=PENDING
> 
> md0 : inactive sde[3](S) sdb[2](S) sdc[1](S) sdd[0](S)
>       9028 blocks super external:imsm
> 
> unused devices: <none>
> 

This shows that Gentoo is most likely not including mdmon in their 
initramfs environment.  mdadm assembles the array readonly, but then 
mdmon is required to mark the array writable.
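
As an illustration only (the container name is taken from your mdstat
output; this is not a substitute for fixing the init scripts), you can
check for and start the monitor by hand:

# is a monitor already running for the container?
ps ax | grep mdmon
# if not, start one for the container device
mdmon /dev/md0
# or take over from an instance started in the initramfs
mdmon --takeover /dev/md0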

> And a look at the kernel messages from dmesg says:
[..]
> [   22.918983] ------------[ cut here ]------------
> [   22.918986] kernel BUG at drivers/md/md.c:6139!

I believe I hit this bug before, and it came down to a mismatch in the
readonly status of the array: the block device was marked read-write
according to blockdev --getro, but the internal md device state was
readonly.  I believe this has been fixed upstream (though the commit
escapes me), but it would also be addressed by having mdmon available
when the array is assembled.  It would be nice if Gentoo adopted Dracut
as their initramfs generation tool, as it already comprehends the mdmon
wrangling issues.
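
A quick way to see that mismatch on a running system (sketch; the
device name is from the mdstat output above):

# block-layer view of the read-only flag
blockdev --getro /dev/md127
# md's own view of the array state (e.g. readonly vs. active)
cat /sys/block/md127/md/array_state
# once mdmon is running for the container, switch the member back to read-write
mdadm --readwrite /dev/md127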

--
Dan


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: (Re: Questions regarding startup of imsm container)
  2010-03-23 14:33     ` Randy Terbush
  2010-03-23 14:49       ` Randy Terbush
  2010-03-23 15:56       ` Luca Berra
@ 2010-03-23 22:41       ` Dan Williams
  2010-03-24 21:35         ` Randy Terbush
  2 siblings, 1 reply; 20+ messages in thread
From: Dan Williams @ 2010-03-23 22:41 UTC (permalink / raw)
  To: Randy Terbush; +Cc: linux-raid

On Tue, Mar 23, 2010 at 7:33 AM, Randy Terbush <randy@terbush.org> wrote:
> To follow-up this startup challenge... here is what I am getting.
>
> mdraid is being started with mdadm -As
>
> I have the following in mdadm.conf
>
> HOMEHOST Volume0
> #DEVICE /dev/sd[bcde]
> AUTO +imsm hifi:0 -all
> ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
> ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
> member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>
> The following devices are being created.
>
> # ls -l /dev/md/
> total 0
> lrwxrwxrwx 1 root root 6 Mar 23 08:10 0 -> ../md0
> lrwxrwxrwx 1 root root 8 Mar 23 08:17 126 -> ../md126
> lrwxrwxrwx 1 root root 8 Mar 23 08:17 127 -> ../md127
> lrwxrwxrwx 1 root root 8 Mar 23 08:17 imsm0 -> ../md127
> lrwxrwxrwx 1 root root 8 Mar 23 08:17 Volume0 -> ../md126
>
> cat /proc/mdstat
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> md126 : active raid5 sdb[3] sdc[2] sdd[1] sde[0]
>      2930280448 blocks super external:/md127/0 level 5, 64k chunk,
> algorithm 0 [4/4] [UUUU]
>      [>....................]  resync =  1.8% (18285824/976760320)
> finish=182.6min speed=87464K/sec
>
> md127 : inactive sde[3](S) sdb[2](S) sdc[1](S) sdd[0](S)
>      9028 blocks super external:imsm
>
> unused devices: <none>
>
> So the container device is getting moved from md0 to md127. Not sure why.

You didn't specify a device name for it in the configuration file so
mdadm picked one for you.

> And would sure like to have a write-intent bitmap active to avoid this
> resync issue which seems to be happening way too frequently.

This could also be a problem with your distribution not taking care of
mdmon properly at shutdown.  The shutdown scripts need to keep mdmon
alive over the final "remounting rootfs readonly" event and wait for
it to mark the array/metadata clean.  Otherwise there is a good chance
that the array will be left dirty and require a resync at startup.

Also note that recent versions of mdadm (3.1.2) and the kernel
(2.6.33) can checkpoint imsm resyncs so at least it will not start
over from the beginning when you reboot in the middle of a resync.

--
Dan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] (Re: Questions regarding startup of imsm container)
  2010-03-23  8:04 ` [PATCH] (Re: Questions regarding startup of imsm container) Luca Berra
  2010-03-23 12:58   ` Randy Terbush
@ 2010-03-23 23:06   ` Dan Williams
  2010-03-24  0:57   ` Neil Brown
  2 siblings, 0 replies; 20+ messages in thread
From: Dan Williams @ 2010-03-23 23:06 UTC (permalink / raw)
  To: linux-raid, neilb

On Tue, Mar 23, 2010 at 1:04 AM, Luca Berra <bluca@comedia.it> wrote:
> The attached patch completely disables bitmap support for arrays with
> externally managed metadata.

It should be possible to use an external bitmap, but that requires
that you have storage separate from the raid array.

# mdadm --grow --bitmap=$(pwd)/test --force /dev/md125
# cat /proc/mdstat
Personalities : [raid1] [raid0]
md125 : active raid1 loop1[1] loop0[0]
      100143 blocks super external:/md127/0 [2/2] [UU]
      bitmap: 13/13 pages [52KB], 4KB chunk, file: /root/test

md127 : inactive loop1[1](S) loop0[0](S)
      418 blocks super external:imsm

unused devices: <none>


> On a style note, I do not like having the struct superswitch, a
> collection of function pointers, instantiated with only some of the
> pointers initialized; it forces checking at runtime whether each one
> is set.

In some cases this is a 'feature', as it is an optional implementation
detail of the metadata format whether it supports, or wants to
override, a given operation.  See ->default_layout() and
->detail_platform().  Another example is the small collection of
operations that are only applicable for mdmon to use.

I think something like this untested patch would be more appropriate
to fix the issue at hand.

diff --git a/bitmap.c b/bitmap.c
index 088e37d..054a507 100644
--- a/bitmap.c
+++ b/bitmap.c
@@ -227,6 +227,12 @@ bitmap_info_t *bitmap_file_read(char *filename, int brief, struct supertype **st
                if (!st) {
                        /* just look at device... */
                        lseek(fd, 0, 0);
+               } else if (!st->ss->locate_bitmap) {
+                       fprintf(stderr, Name
+                               ": %s-metadata arrays do not support an internal bitmap\n",
+                               st->ss->name);
+                       close(fd);
+                       return NULL;
                } else {
                        st->ss->locate_bitmap(st, fd);
                }
                }


> A possible solution would be to wrap every call through these pointers
> in a macro that checks for NULL first, but how do you return the
> correct return type from that?

We need to fix up all the locations that assume native md metadata;
once that is done, a macro is not needed.

--
Dan

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: Questions regarding startup of imsm container
  2010-03-23 22:16     ` Dan Williams
@ 2010-03-23 23:25       ` Randy Terbush
  2010-03-24  0:23         ` Randy Terbush
  0 siblings, 1 reply; 20+ messages in thread
From: Randy Terbush @ 2010-03-23 23:25 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux raid

Thanks Dan, a few more steps forward here. I suspect I know the
answer, but will see what you suggest.

On Tue, Mar 23, 2010 at 4:16 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> This shows that Gentoo is most likely not including mdmon in their initramfs
> environment.  mdadm assembles the array readonly, but then mdmon is required
> to mark the array writable.

Looks like you are correct and current installation packages on Gentoo
have apparently not dealt with these changes. mdmon is not getting
started and is not being attempted anywhere.

I don't run an initrd, so I attempted to start mdmon after the mdadm -As
runs. This is apparently too early in the process as I get the
following:

* Starting up RAID devices ...
 [ ok ]
mdmon: Neither /var/run nor /lib/init/rw are writable
       cannot create .pid or .sock files.  Aborting
 * Setting up the Logical Volume Manager ...
 [ ok ]
 * Checking local filesystems  ...
HOME-vg0: clean, 13/3276800 files, 256151/52428800 blocks
Warning... fsck.ext4 for device /dev/mapper/vg0-home exited with signal 11.
SVN-vg0: clean, 191/1638400 files, 153087/26214400 blocks
Warning... fsck.ext4 for device /dev/mapper/vg0-svn exited with signal 11.
ARCHIVE-vg0: clean, 12/1638400 files, 152150/26214400 blocks

So it appears the start of mdmon needs to wait until we have a rw
filesystem mounted. Not entirely sure if it is related, but as you can
see above, fsck blows up trying to check the filesystems on this
array. That appears to clear itself up once mdmon is running. After
starting mdmon by hand, the resync begins and I can successfully run
fsck on these partitions.

So looks like I have a chicken and egg problem that I suspect may be
solved by creating an initramfs. I took a quick pass at dracut but
could not convince it to add mdmon. Any hints appreciated as I go back
to dig for more info.

Thanks again for the assistance.
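
For reference, something along these lines can pull mdmon into a
dracut-built image (a sketch; the "mdraid" module name, the image path,
and the lsinitrd check are assumptions about the dracut version in use):

# dracut --force --add mdraid /boot/initramfs-$(uname -r).img $(uname -r)
# lsinitrd /boot/initramfs-$(uname -r).img | grep mdmon    # confirm mdmon was included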

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Questions regarding startup of imsm container
  2010-03-23 23:25       ` Randy Terbush
@ 2010-03-24  0:23         ` Randy Terbush
  2010-03-24  4:14           ` Randy Terbush
  2010-03-24  5:54           ` Dan Williams
  0 siblings, 2 replies; 20+ messages in thread
From: Randy Terbush @ 2010-03-24  0:23 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux raid

Update on progress....

I have managed to hack together a dracut created initramfs that has me
able to boot and has resolved all boot issues. Will let this thing
have a few hours of resyncing and see if external bitmap comes back
and how it handles the reboot.

Thanks again for the assistance.

On Tue, Mar 23, 2010 at 5:25 PM, Randy Terbush <randy@terbush.org> wrote:
> Thanks Dan, a few more steps forward here. I suspect I know the
> answer, but will see what you suggest.
>
> On Tue, Mar 23, 2010 at 4:16 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> This shows that Gentoo is most likely not including mdmon in their initramfs
>> environment.  mdadm assembles the array readonly, but then mdmon is required
>> to mark the array writable.
>
> Looks like you are correct and current installation packages on Gentoo
> have apparently not dealt with these changes. mdmon is not getting
> started and is not being attempted anywhere.
>
> I don't run an initrd, so I attempted to start mdmon after the mdadm -As
> runs. This is apparently too early in the process as I get the
> following:
>
> * Starting up RAID devices ...
>  [ ok ]
> mdmon: Neither /var/run nor /lib/init/rw are writable
>       cannot create .pid or .sock files.  Aborting
>  * Setting up the Logical Volume Manager ...
>  [ ok ]
>  * Checking local filesystems  ...
> HOME-vg0: clean, 13/3276800 files, 256151/52428800 blocks
> Warning... fsck.ext4 for device /dev/mapper/vg0-home exited with signal 11.
> SVN-vg0: clean, 191/1638400 files, 153087/26214400 blocks
> Warning... fsck.ext4 for device /dev/mapper/vg0-svn exited with signal 11.
> ARCHIVE-vg0: clean, 12/1638400 files, 152150/26214400 blocks
>
> So it appears the start of mdmon needs to wait until we have a rw
> filesystem mounted. Not entirely sure if it is related, but as you can
> see above, fsck blows up trying to check the filesystems on this
> array. That appears to clear itself up once mdmon is running. After
> starting mdmon by hand, the resync begins and I can successfully run
> fsck on these partitions.
>
> So looks like I have a chicken and egg problem that I suspect may be
> solved by creating an initramfs. I took a quick pass at dracut but
> could not convince it to add mdmon. Any hints appreciated as I go back
> to dig for more info.
>
> Thanks again for the assistance.
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] (Re: Questions regarding startup of imsm container)
  2010-03-23  8:04 ` [PATCH] (Re: Questions regarding startup of imsm container) Luca Berra
  2010-03-23 12:58   ` Randy Terbush
  2010-03-23 23:06   ` [PATCH] " Dan Williams
@ 2010-03-24  0:57   ` Neil Brown
  2010-03-24  6:12     ` Luca Berra
  2010-03-24 14:49     ` Dan Williams
  2 siblings, 2 replies; 20+ messages in thread
From: Neil Brown @ 2010-03-24  0:57 UTC (permalink / raw)
  To: Luca Berra; +Cc: linux-raid

On Tue, 23 Mar 2010 09:04:19 +0100
Luca Berra <bluca@comedia.it> wrote:

> On Mon, Mar 22, 2010 at 09:56:01PM -0600, Randy Terbush wrote:
> >Having a go at building a raid5 array using the new imsm support and
> >having good luck keeping drives in the array, etc. Nice work. I have a
> >few questions though as I am having some trouble figuring out how to
> >properly start this container.
> >
> ># mdadm --version
> >mdadm - v3.1.2 - 10th March 2010
> >
> ># mdadm -Es
> >ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
> >ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
> >member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
> >
> ># ls -l /dev/md/
> >total 0
> >lrwxrwxrwx 1 root root 6 Mar 22 20:54 0 -> ../md0
> >lrwxrwxrwx 1 root root 8 Mar 22 20:54 127 -> ../md127
> >lrwxrwxrwx 1 root root 8 Mar 22 20:54 Volume0_0 -> ../md127
> >
> >As you can see, the name for the link in /dev/md does not agree with
> >the name that the Examine is coming up with.
> please read mdadm.conf manpage, under the section "HOMEHOST"
> 
> >Is it better to just forgo the ARRAY statements and go with an AUTO +imsm?
> >
> >And last, does the concept of a write-intent bitmap make sense on an
> >imsm container? If so, I get a segv if trying to run mdadm /dev/mdX
> >-Gb internal on either device.
> 
> i don't believe it makes sense at all, surely imsm do not support an
> internal bitmap (no provisioning for it in the metadata)
> 
> The attached patch completely disables bitmap support for arrays with
> externally managed metadata.

Thanks for the patch.  However I would prefer to disable bitmap support for
those metadata formats which report that they don't support it.
Thus the following patch.

Thanks,
NeilBrown


diff --git a/Create.c b/Create.c
index 909ac5d..8f6e6e7 100644
--- a/Create.c
+++ b/Create.c
@@ -651,6 +651,11 @@ int Create(struct supertype *st, char *mddev,
 			fprintf(stderr, Name ": internal bitmaps not supported by this kernel.\n");
 			goto abort;
 		}
+		if (!st->ss->add_internal_bitmap) {
+			fprintf(stderr, Name ": internal bitmaps not supported with %s metadata\n",
+				st->ss->name);
+			goto abort;
+		}
 		if (!st->ss->add_internal_bitmap(st, &bitmap_chunk,
 						 delay, write_behind,
 						 bitmapsize, 1, major_num)) {
diff --git a/Grow.c b/Grow.c
index 6264996..053a372 100644
--- a/Grow.c
+++ b/Grow.c
@@ -288,6 +288,11 @@ int Grow_addbitmap(char *devname, int fd, char *file, int chunk, int delay, int
 		return 1;
 	} else if (strcmp(file, "internal") == 0) {
 		int d;
+		if (st->ss->add_internal_bitmap == NULL) {
+			fprintf(stderr, Name ": Internal bitmaps not supported "
+				"with %s metadata\n", st->ss->name);
+			return 1;
+		}
 		for (d=0; d< st->max_devs; d++) {
 			mdu_disk_info_t disk;
 			char *dv;
diff --git a/bitmap.c b/bitmap.c
index 088e37d..beef2dc 100644
--- a/bitmap.c
+++ b/bitmap.c
@@ -227,9 +227,13 @@ bitmap_info_t *bitmap_file_read(char *filename, int brief, struct supertype **st
 		if (!st) {
 			/* just look at device... */
 			lseek(fd, 0, 0);
-		} else {
+		} else if (!st->ss->locate_bitmap) {
+			fprintf(stderr, Name ": No bitmap possible with %s metadata\n",
+				st->ss->name);
+			return NULL;
+		} else
 			st->ss->locate_bitmap(st, fd);
-		}
+
 		ioctl(fd, BLKFLSBUF, 0); /* make sure we read current data */
 		*stp = st;
 	} else {

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: Questions regarding startup of imsm container
  2010-03-24  0:23         ` Randy Terbush
@ 2010-03-24  4:14           ` Randy Terbush
  2010-03-24  5:54           ` Dan Williams
  1 sibling, 0 replies; 20+ messages in thread
From: Randy Terbush @ 2010-03-24  4:14 UTC (permalink / raw)
  To: linux raid

The array is now surviving a reboot without dropping into a resync
when we start back up.

However, the external bitmap is not getting recreated. My mdadm.conf
looks like this:

ARRAY /dev/md/0 metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
ARRAY /dev/md/127 container=30223250:76fd248b:50280919:0836b7f0
member=0 bitmap=/boot/md/127-RAID5-bitmap
UUID=8a4ae452:da1e7832:70ecf895:eb58229c

The /boot partition is being mounted but may not be mounted at the
time the array is started. Not finding any error messages in logs
regarding the bitmap. The help is much appreciated.
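
Until assembly-time reattachment is sorted out, the bitmap can be
re-added by hand with the same --grow form shown earlier in the thread
(a sketch using the bitmap path and member device from the config
above; whether this is the right long-term answer is an open question):

# mdadm --grow --bitmap=/boot/md/127-RAID5-bitmap --force /dev/md/127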

On Tue, Mar 23, 2010 at 6:23 PM, Randy Terbush <randy@terbush.org> wrote:
> Update on progress....
>
> I have managed to hack together a dracut created initramfs that has me
> able to boot and has resolved all boot issues. Will let this thing
> have a few hours of resyncing and see if external bitmap comes back
> and how it handles the reboot.
>
> Thanks again for the assistance.
>
> On Tue, Mar 23, 2010 at 5:25 PM, Randy Terbush <randy@terbush.org> wrote:
>> Thanks Dan, a few more steps forward here. I suspect I know the
>> answer, but will see what you suggest.
>>
>> On Tue, Mar 23, 2010 at 4:16 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>>> This shows that Gentoo is most likely not including mdmon in their initramfs
>>> environment.  mdadm assembles the array readonly, but then mdmon is required
>>> to mark the array writable.
>>
>> Looks like you are correct and current installation packages on Gentoo
>> have apparently not dealt with these changes. mdmon is not getting
>> started and is not being attempted anywhere.
>>
>> I don't run an initrd, so I attempted to start mdmon after the mdadm -As
>> runs. This is apparently too early in the process as I get the
>> following:
>>
>> * Starting up RAID devices ...
>>  [ ok ]
>> mdmon: Neither /var/run nor /lib/init/rw are writable
>>       cannot create .pid or .sock files.  Aborting
>>  * Setting up the Logical Volume Manager ...
>>  [ ok ]
>>  * Checking local filesystems  ...
>> HOME-vg0: clean, 13/3276800 files, 256151/52428800 blocks
>> Warning... fsck.ext4 for device /dev/mapper/vg0-home exited with signal 11.
>> SVN-vg0: clean, 191/1638400 files, 153087/26214400 blocks
>> Warning... fsck.ext4 for device /dev/mapper/vg0-svn exited with signal 11.
>> ARCHIVE-vg0: clean, 12/1638400 files, 152150/26214400 blocks
>>
>> So it appears the start of mdmon needs to wait until we have a rw
>> filesystem mounted. Not entirely sure if it is related, but as you can
>> see above, fsck blows up trying to check the filesystems on this
>> array. That appears to clear itself up once mdmon is running. After
>> starting mdmon by hand, the resync begins and I can successfully run
>> fsck on these partitions.
>>
>> So looks like I have a chicken and egg problem that I suspect may be
>> solved by creating an initramfs. I took a quick pass at dracut but
>> could not convince it to add mdmon. Any hints appreciated as I go back
>> to dig for more info.
>>
>> Thanks again for the assistance.
>>
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Questions regarding startup of imsm container
  2010-03-24  0:23         ` Randy Terbush
  2010-03-24  4:14           ` Randy Terbush
@ 2010-03-24  5:54           ` Dan Williams
  1 sibling, 0 replies; 20+ messages in thread
From: Dan Williams @ 2010-03-24  5:54 UTC (permalink / raw)
  To: Randy Terbush; +Cc: linux raid

Randy Terbush wrote:
> Update on progress....
> 
> I have managed to hack together a dracut created initramfs that has me
> able to boot and has resolved all boot issues. Will let this thing
> have a few hours of resyncing and see if external bitmap comes back
> and how it handles the reboot.
> 

I expect it will still come back up and resync again because I doubt the 
initscripts and killall5 implementation in Gentoo comprehend the 
coordination that needs to happen with mdadm/mdmon at system shutdown.

The two necessary pieces for external metadata support in the system 
halt script are:
1/ killall5 needs to omit mdmon from the process shutdown step
2/ A call to mdadm --wait-clean --scan is needed to ensure that mdmon 
has had a chance to mark the array clean before reboot/shutdown.

Note that step 2 requires that mdadm can find the mdmon communication 
socket which lives (by default) in /var/run/mdadm.
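
A minimal halt-script fragment along these lines covers both pieces (a
sketch only, assuming a sysvinit-style shutdown where killall5 supports
the -o option and pidof is available):

OMIT=""
for pid in $(pidof mdmon); do      # collect mdmon pids so killall5 spares them
        OMIT="$OMIT -o $pid"
done
killall5 -15 $OMIT
sleep 2
killall5 -9 $OMIT
# ... remount the root filesystem read-only as usual, then:
mdadm --wait-clean --scan          # let mdmon mark the array/metadata clean before halt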

--
Dan


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] (Re: Questions regarding startup of imsm container)
  2010-03-24  0:57   ` Neil Brown
@ 2010-03-24  6:12     ` Luca Berra
  2010-03-24 14:49     ` Dan Williams
  1 sibling, 0 replies; 20+ messages in thread
From: Luca Berra @ 2010-03-24  6:12 UTC (permalink / raw)
  To: linux-raid

On Wed, Mar 24, 2010 at 11:57:49AM +1100, Neil Brown wrote:
>On Tue, 23 Mar 2010 09:04:19 +0100
>Thanks for the patch.  However I would prefer to disable bitmap support for
>those metadata formats which report that they don't support it.
>Thus the following patch.
It is more reasonable, as Dan confirmed that external bitmaps should
work, which I was unsure of at the beginning.

L.

-- 
Luca Berra -- bluca@comedia.it
         Communication Media & Services S.r.l.
  /"\
  \ /     ASCII RIBBON CAMPAIGN
   X        AGAINST HTML MAIL
  / \

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] (Re: Questions regarding startup of imsm container)
  2010-03-24  0:57   ` Neil Brown
  2010-03-24  6:12     ` Luca Berra
@ 2010-03-24 14:49     ` Dan Williams
  1 sibling, 0 replies; 20+ messages in thread
From: Dan Williams @ 2010-03-24 14:49 UTC (permalink / raw)
  To: Neil Brown; +Cc: Luca Berra, linux-raid

On Tue, Mar 23, 2010 at 5:57 PM, Neil Brown <neilb@suse.de> wrote:
> On Tue, 23 Mar 2010 09:04:19 +0100
> Luca Berra <bluca@comedia.it> wrote:
>
>> On Mon, Mar 22, 2010 at 09:56:01PM -0600, Randy Terbush wrote:
>> >Having a go at building a raid5 array using the new imsm support and
>> >having good luck keeping drives in the array, etc. Nice work. I have a
>> >few questions though as I am having some trouble figuring out how to
>> >properly start this container.
>> >
>> ># mdadm --version
>> >mdadm - v3.1.2 - 10th March 2010
>> >
>> ># mdadm -Es
>> >ARRAY metadata=imsm UUID=30223250:76fd248b:50280919:0836b7f0
>> >ARRAY /dev/md/Volume0 container=30223250:76fd248b:50280919:0836b7f0
>> >member=0 UUID=8a4ae452:da1e7832:70ecf895:eb58229c
>> >
>> ># ls -l /dev/md/
>> >total 0
>> >lrwxrwxrwx 1 root root 6 Mar 22 20:54 0 -> ../md0
>> >lrwxrwxrwx 1 root root 8 Mar 22 20:54 127 -> ../md127
>> >lrwxrwxrwx 1 root root 8 Mar 22 20:54 Volume0_0 -> ../md127
>> >
>> >As you can see, the name for the link in /dev/md does not agree with
>> >the name that the Examine is coming up with.
>> please read mdadm.conf manpage, under the section "HOMEHOST"
>>
>> >Is it better to just forgo the ARRAY statements and go with an AUTO +imsm?
>> >
>> >And last, does the concept of a write-intent bitmap make sense on an
>> >imsm container? If so, I get a segv if trying to run mdadm /dev/mdX
>> >-Gb internal on either device.
>>
>> i don't believe it makes sense at all, surely imsm do not support an
>> internal bitmap (no provisioning for it in the metadata)
>>
>> The attached patch completely disables bitmap support for arrays with
>> externally managed metadata.
>
> Thanks for the patch.  However I would prefer to disable bitmap support for
> those metadata formats which report that they don't support it.
> Thus the following patch.
>
> Thanks,
> NeilBrown
>
>
> +               if (!st->ss->add_internal_bitmap) {
[..]
> +               if (st->ss->add_internal_bitmap == NULL) {
[..]
> +               } else if (!st->ss->locate_bitmap) {

The smallest of nits, or maybe just a clarification.  I believe you
have said in the past that you prefer the readability of
positive-logic if statements where possible.

--
Dan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: (Re: Questions regarding startup of imsm container)
  2010-03-23 22:41       ` Dan Williams
@ 2010-03-24 21:35         ` Randy Terbush
  0 siblings, 0 replies; 20+ messages in thread
From: Randy Terbush @ 2010-03-24 21:35 UTC (permalink / raw)
  To: Dan Williams, linux-raid

On Tue, Mar 23, 2010 at 4:41 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> And would sure like to have a write-intent bitmap active to avoid this
>> resync issue which seems to be happening way too frequently.
>
> This could also be a problem with your distribution not taking care of
> mdmon properly at shutdown.  The shutdown scripts need to keep mdmon
> alive over the final "remounting rootfs readonly" event and wait for
> it to mark the array/metadata clean.  Otherwise there is a good chance
> that the array will be left dirty and require a resync at startup.
>
> Also note that recent versions of mdadm (3.1.2) and the kernel
> (2.6.33) can checkpoint imsm resyncs so at least it will not start
> over from the beginning when you reboot in the middle of a resync.

Does this functionality just work with the described version
combination, or is there something that needs to be done to activate
this behavior?

On a semi-related note, I believe I have worked out all of the startup
and shutdown issues and do have an imsm RAID configured that will now
survive a reboot without going into resync. However, I still am not
able to convince the array at startup to reconnect the external
write-intent bitmap. Is there some magic here that I am missing?

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread

Thread overview: 20+ messages
2010-03-23  3:56 Questions regarding startup of imsm container Randy Terbush
2010-03-23  8:04 ` [PATCH] (Re: Questions regarding startup of imsm container) Luca Berra
2010-03-23 12:58   ` Randy Terbush
2010-03-23 14:22     ` Luca Berra
2010-03-23 14:33     ` Randy Terbush
2010-03-23 14:49       ` Randy Terbush
2010-03-23 15:56       ` Luca Berra
2010-03-23 22:41       ` Dan Williams
2010-03-24 21:35         ` Randy Terbush
2010-03-23 23:06   ` [PATCH] " Dan Williams
2010-03-24  0:57   ` Neil Brown
2010-03-24  6:12     ` Luca Berra
2010-03-24 14:49     ` Dan Williams
2010-03-23 21:01 ` Questions regarding startup of imsm container Dan Williams
2010-03-23 21:41   ` Randy Terbush
2010-03-23 22:16     ` Dan Williams
2010-03-23 23:25       ` Randy Terbush
2010-03-24  0:23         ` Randy Terbush
2010-03-24  4:14           ` Randy Terbush
2010-03-24  5:54           ` Dan Williams
