* Minor mdadm fixes
@ 2010-01-11 20:38 Doug Ledford
2010-01-12 0:49 ` Mr. James W. Laferriere
2010-01-18 22:05 ` Neil Brown
0 siblings, 2 replies; 14+ messages in thread
From: Doug Ledford @ 2010-01-11 20:38 UTC (permalink / raw)
To: linux-raid
These are a number of minor fixes we carry in our mdadm at the moment. Would
prefer not to carry them ourselves ;-)
Neil, any clue when you think might release mdadm-3.1.2?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-01-11 20:38 Doug Ledford
@ 2010-01-12 0:49 ` Mr. James W. Laferriere
2010-01-12 3:10 ` Andre Noll
2010-01-18 22:05 ` Neil Brown
1 sibling, 1 reply; 14+ messages in thread
From: Mr. James W. Laferriere @ 2010-01-12 0:49 UTC (permalink / raw)
To: Doug Ledford; +Cc: linux-raid
Hello Doug ,
On Mon, 11 Jan 2010, Doug Ledford wrote:
> These are a number of minor fixes we carry in our mdadm at the moment. Would
> prefer not to carry them ourselves ;-)
>
> Neil, any clue when you think might release mdadm-3.1.2?
Would you please annotate the git diffs you sent out ?
For one I am very confused about the change of
- sprintf(path, "/var/run/mdadm/%s.pid", devname);
+ sprintf(path, "/dev/.mdadm/%s.pid", devname);
why ? For one .
None of the other patches has a annotation either . While if one is a
kernel coding guru they are probably very readable . I am not but would like to
know what the intention is for the change(s) .
Tia , JimL
--
+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network&System Engineer | 3237 Holden Road | Give me Linux |
| babydr@baby-dragons.com | Fairbanks, AK. 99709 | only on AXP |
+------------------------------------------------------------------+
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-01-12 0:49 ` Mr. James W. Laferriere
@ 2010-01-12 3:10 ` Andre Noll
2010-01-12 3:36 ` Doug Ledford
0 siblings, 1 reply; 14+ messages in thread
From: Andre Noll @ 2010-01-12 3:10 UTC (permalink / raw)
To: Mr. James W. Laferriere; +Cc: Doug Ledford, linux-raid
[-- Attachment #1: Type: text/plain, Size: 1463 bytes --]
On 15:49, Mr. James W. Laferriere wrote:
> Hello Doug ,
>
> On Mon, 11 Jan 2010, Doug Ledford wrote:
> >These are a number of minor fixes we carry in our mdadm at the moment.
> >Would
> >prefer not to carry them ourselves ;-)
> >
> >Neil, any clue when you think might release mdadm-3.1.2?
> Would you please annotate the git diffs you sent out ?
> For one I am very confused about the change of
>
> - sprintf(path, "/var/run/mdadm/%s.pid", devname);
> + sprintf(path, "/dev/.mdadm/%s.pid", devname);
>
> why ? For one .
>
> None of the other patches has a annotation either . While if one is
> a kernel coding guru they are probably very readable . I am not but would
> like to know what the intention is for the change(s) .
The annotation is the subject line :)
Before switching root it is sometimes desirable to mount --bind (or
--move) the current /dev into the new root as the programs started
from the new root will need the device nodes. For example the init
scripts that run from the intramfs move /dev after it has been mounted
as a ramfs and populated.
I believe (Doug, please correct me if the following is wrong) the
advantage of storing mdmon's pid files in /dev is that in /dev they
remain visible after the switch to the new root. That's the "handoff
after pivotroot" in the subject.
Regards
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-01-12 3:10 ` Andre Noll
@ 2010-01-12 3:36 ` Doug Ledford
2010-01-12 4:39 ` Andre Noll
0 siblings, 1 reply; 14+ messages in thread
From: Doug Ledford @ 2010-01-12 3:36 UTC (permalink / raw)
To: Andre Noll; +Cc: Mr. James W. Laferriere, linux-raid
[-- Attachment #1: Type: text/plain, Size: 3021 bytes --]
On 01/11/2010 10:10 PM, Andre Noll wrote:
> On 15:49, Mr. James W. Laferriere wrote:
>> Hello Doug ,
>>
>> On Mon, 11 Jan 2010, Doug Ledford wrote:
>>> These are a number of minor fixes we carry in our mdadm at the moment.
>>> Would
>>> prefer not to carry them ourselves ;-)
>>>
>>> Neil, any clue when you think might release mdadm-3.1.2?
>> Would you please annotate the git diffs you sent out ?
>> For one I am very confused about the change of
>>
>> - sprintf(path, "/var/run/mdadm/%s.pid", devname);
>> + sprintf(path, "/dev/.mdadm/%s.pid", devname);
>>
>> why ? For one .
>>
>> None of the other patches has a annotation either . While if one is
>> a kernel coding guru they are probably very readable . I am not but would
>> like to know what the intention is for the change(s) .
>
> The annotation is the subject line :)
>
> Before switching root it is sometimes desirable to mount --bind (or
> --move) the current /dev into the new root as the programs started
> from the new root will need the device nodes. For example the init
> scripts that run from the intramfs move /dev after it has been mounted
> as a ramfs and populated.
>
> I believe (Doug, please correct me if the following is wrong) the
> advantage of storing mdmon's pid files in /dev is that in /dev they
> remain visible after the switch to the new root. That's the "handoff
> after pivotroot" in the subject.
I'm a little fuzzy on this myself. The original patch was from another
Red Hatter that works on dracut, the new mkinitrd replacement in Fedora
12. When integrating IMSM support in the md raid stack and dracut,
there became a problem with starting mdmon in the initramfs filesystem
and then transitioning it to the new filesystem. It turns out that, as
you point out, because /dev is moved from the initramfs to the new root
(mainly because udev is now started in the initramfs), we could avoid a
number of issues caused by mdmon's files being in /var/run instead of
/dev. This also allowed us to do the mdmon restart *after* the
switchroot had taken place and solved a number of issues with getting
mdmon support to work. Like I said, I'm a bit fuzzy on the details
myself because it was Hans de Goede that was doing this work, not me,
and I just vaguely recall the problems this patch solved. But, it's not
surprising given that we already did a similar patch to move the
mdadm.map file from /var/run/mdadm to /dev simply because during the
very early boot stages after you have done the switch root, /var/run is
usually read only while /dev/ is read write and incremental assembly
won't work properly when it can't write to the mdadm.map file. So,
really this is in the same vein although the specific issues are different.
--
Doug Ledford <dledford@redhat.com>
GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-01-12 3:36 ` Doug Ledford
@ 2010-01-12 4:39 ` Andre Noll
2010-01-12 4:46 ` Doug Ledford
0 siblings, 1 reply; 14+ messages in thread
From: Andre Noll @ 2010-01-12 4:39 UTC (permalink / raw)
To: Doug Ledford; +Cc: Mr. James W. Laferriere, linux-raid
[-- Attachment #1: Type: text/plain, Size: 1367 bytes --]
On 22:36, Doug Ledford wrote:
> > I believe (Doug, please correct me if the following is wrong) the
> > advantage of storing mdmon's pid files in /dev is that in /dev they
> > remain visible after the switch to the new root. That's the "handoff
> > after pivotroot" in the subject.
>
> I'm a little fuzzy on this myself. The original patch was from another
> Red Hatter that works on dracut, the new mkinitrd replacement in Fedora
> 12. When integrating IMSM support in the md raid stack and dracut,
> there became a problem with starting mdmon in the initramfs filesystem
> and then transitioning it to the new filesystem. It turns out that, as
> you point out, because /dev is moved from the initramfs to the new root
> (mainly because udev is now started in the initramfs), we could avoid a
> number of issues caused by mdmon's files being in /var/run instead of
> /dev.
One could also mount /var/run in the initramfs and move it over
like /dev, but that gets a bit messy because /var might be on a yet
another fs..
> This also allowed us to do the mdmon restart *after* the switchroot
> had taken place and solved a number of issues with getting mdmon
> support to work.
Just out of interest: Why does mdmon need the restart at all?
Thanks
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-01-12 4:39 ` Andre Noll
@ 2010-01-12 4:46 ` Doug Ledford
2010-01-12 5:21 ` Andre Noll
0 siblings, 1 reply; 14+ messages in thread
From: Doug Ledford @ 2010-01-12 4:46 UTC (permalink / raw)
To: Andre Noll; +Cc: Mr. James W. Laferriere, linux-raid
[-- Attachment #1: Type: text/plain, Size: 1944 bytes --]
On 01/11/2010 11:39 PM, Andre Noll wrote:
> On 22:36, Doug Ledford wrote:
>>> I believe (Doug, please correct me if the following is wrong) the
>>> advantage of storing mdmon's pid files in /dev is that in /dev they
>>> remain visible after the switch to the new root. That's the "handoff
>>> after pivotroot" in the subject.
>>
>> I'm a little fuzzy on this myself. The original patch was from another
>> Red Hatter that works on dracut, the new mkinitrd replacement in Fedora
>> 12. When integrating IMSM support in the md raid stack and dracut,
>> there became a problem with starting mdmon in the initramfs filesystem
>> and then transitioning it to the new filesystem. It turns out that, as
>> you point out, because /dev is moved from the initramfs to the new root
>> (mainly because udev is now started in the initramfs), we could avoid a
>> number of issues caused by mdmon's files being in /var/run instead of
>> /dev.
>
> One could also mount /var/run in the initramfs and move it over
> like /dev, but that gets a bit messy because /var might be on a yet
> another fs..
>
>> This also allowed us to do the mdmon restart *after* the switchroot
>> had taken place and solved a number of issues with getting mdmon
>> support to work.
>
> Just out of interest: Why does mdmon need the restart at all?
Because of the design and limitation of page cache. If you start a long
running program from the initramfs, then the page cache usage will pin
the initramfs in memory. You need to reexec the program from the hard
drive so that the initramfs can be freed. Of course, this is all
because superblock handling for imsm superblocks is in user
space...<grumble, grumble, grumble>.
--
Doug Ledford <dledford@redhat.com>
GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-01-12 4:46 ` Doug Ledford
@ 2010-01-12 5:21 ` Andre Noll
0 siblings, 0 replies; 14+ messages in thread
From: Andre Noll @ 2010-01-12 5:21 UTC (permalink / raw)
To: Doug Ledford; +Cc: Mr. James W. Laferriere, linux-raid
[-- Attachment #1: Type: text/plain, Size: 659 bytes --]
On 23:46, Doug Ledford wrote:
> > Just out of interest: Why does mdmon need the restart at all?
>
> Because of the design and limitation of page cache. If you start a long
> running program from the initramfs, then the page cache usage will pin
> the initramfs in memory. You need to reexec the program from the hard
> drive so that the initramfs can be freed. Of course, this is all
> because superblock handling for imsm superblocks is in user
> space...<grumble, grumble, grumble>.
You could copy the mdmon executable to /dev and execute it there :)
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-01-11 20:38 Doug Ledford
2010-01-12 0:49 ` Mr. James W. Laferriere
@ 2010-01-18 22:05 ` Neil Brown
1 sibling, 0 replies; 14+ messages in thread
From: Neil Brown @ 2010-01-18 22:05 UTC (permalink / raw)
To: Doug Ledford; +Cc: linux-raid
On Mon, 11 Jan 2010 15:38:09 -0500
Doug Ledford <dledford@redhat.com> wrote:
> These are a number of minor fixes we carry in our mdadm at the moment. Would
> prefer not to carry them ourselves ;-)
Fair enough - I've taken three of them...
>
> Neil, any clue when you think might release mdadm-3.1.2?
Maybe in February. Feel free to remind me if I haven't done anything by
about the 14th.
Thanks,
NeilBrown
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Minor mdadm fixes
@ 2010-03-19 2:13 Doug Ledford
2010-03-19 7:01 ` Luca Berra
2010-03-24 0:27 ` Neil Brown
0 siblings, 2 replies; 14+ messages in thread
From: Doug Ledford @ 2010-03-19 2:13 UTC (permalink / raw)
To: Linux RAID Mailing List, Neil Brown
[-- Attachment #1.1: Type: text/plain, Size: 3591 bytes --]
I have several minor fixes against the 3.1.2 mdadm.
First attachment: mdadm-3.1.1-endian.patch
Problem: on ppc arches mdadm would fail to assemble arrays if you used
mdadm -Db /dev/<array> to populate the ARRAY line in mdadm.conf. This
is because the code in Detail.c calls fname_from_uuid() which honors the
swapuuid setting in the superblock switch struct. This is great for ddf
and imsm superblocks which call fname_from_uuid() all over the place and
are therefore consistent. However, super1.c does *not* call
fname_from_uuid()...ever. And super1.c does *not* perform the byte
swapping as per swapuuid in the switch struct. At least not on
*display*, but it *does* honor swapuuid on actual analysis of the uuid
in terms of figuring out matches. So, if you attempt to change the
value of swapuuid in the super1 switch struct, then uuid's never match
what's printed out (aka, the output of -D and -E both fail to match
against the detected uuid when assembling, although at least -D and -E
are now consistent). Instead, I did a simple (and gross) hack to ignore
swapuuid in fname_from_uuid() only for version 1 superblocks. The other
alternative might be to make super1.c call fname_from_uuid() all over
the place instead of printing things out itself, but that would involve
struct changes as the way it stores/accesses uuids in super1 is not per
char like fname_from_uuid and that's precisely part of the problem.
Second: mdadm-3.1.2-mapfile.patch
Problem: Neil's support for putting the mdadm map file wherever you need
it is nice, but one place in particular needs a special case.
Specifically, mdadm already creates /dev/md if needed to store symlinks,
or in the case of mdmon needing to create pid/sock files (if ALT_RUN is
set to /dev/md). This creates an expectation in udev and friends that
mdadm will create /dev/md when needed, period. We need it if we place
the mapfile in /dev/md prior to any symlinks or mdmon files being
created, so special case that one location in order to make us
consistent with both mdmon and mdopen. If we don't, we will fail to
create our mapfile.
Third: mdadm-3.1.2-rebuild.patch
Problem: The above issue pointed out a different issue. Specifically,
if we can't open our mapfile, then we call RebuildMap, and at the end of
the rebuild, we trigger a new change event on the raid device. Well, if
the write of the new map file failed because the directory we wanted to
write into didn't exist, this creates an infinite loop where udev calls
mdadm, mdadm fails to open file, mdadm rebuils map, mdadm fails to write
new map, but mdadm triggers a change event to cause udev to rerun mdadm
on the belief that the map file *was* updated. So, if we fail to write
the new mapfile, don't signal a change event, let the situation drop.
This avoids the infinite loop.
Fourth: mdadm-3.1.2-mapname.patch
Feature: If we are able to easily select the location of the mapfile via
the use of ALT_RUN at compile time, it makes sense to also be able to
set the filename. This way you can make it something obvious such as
/dev/md + md-device-map which makes the file name both unlikely to
conflict with a device name (where as I could see map conflicting in
certain specialized scenarios, like the array holding map data at the
local governmental office) and obvious in purpose.
--
Doug Ledford <dledford@redhat.com>
GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband
[-- Attachment #1.2: mdadm-3.1.1-endian.patch --]
[-- Type: text/plain, Size: 807 bytes --]
--- mdadm-3.1.1/util.c.endian 2010-02-21 19:13:56.253610477 -0500
+++ mdadm-3.1.1/util.c 2010-02-21 19:16:26.338375501 -0500
@@ -395,7 +395,12 @@ char *__fname_from_uuid(int id[4], int s
char *fname_from_uuid(struct supertype *st, struct mdinfo *info, char *buf, char sep)
{
- return __fname_from_uuid(info->uuid, st->ss->swapuuid, buf, sep);
+ // dirty hack to work around an issue with super1 superblocks...
+ // super1 superblocks need swapuuid set in order for assembly to
+ // work, but can't have it set if we want this printout to match
+ // all the other uuid printouts in super1.c, so we force swapuuid
+ // to 1 to make our printout match the rest of super1
+ return __fname_from_uuid(info->uuid, (st->ss == &super1) ? 1 : st->ss->swapuuid, buf, sep);
}
#ifndef MDASSEMBLE
[-- Attachment #1.3: mdadm-3.1.2-mapfile.patch --]
[-- Type: text/plain, Size: 1490 bytes --]
commit b25462c1bee1a01c6aae3c215e040bfb2f1c2fb7
Author: Doug Ledford <dledford@redhat.com>
Date: Tue Mar 16 23:00:11 2010 -0400
Create /dev/md in mapfile open like we do in mdopen if our ALT_RUN is
set to be /dev/md. This keeps udev happy as it won't have to special
case our /dev/md directory needs.
Signed-off-by: Doug Ledford <dledford@redhat.com>
diff --git a/mapfile.c b/mapfile.c
index 366ebe3..eed17c8 100644
--- a/mapfile.c
+++ b/mapfile.c
@@ -29,7 +29,7 @@
*/
/* /var/run/mdadm.map is used to track arrays being created in --incremental
- * more. It particularly allows lookup from UUID to array device, but
+ * mode. It particularly allows lookup from UUID to array device, but
* also allows the array device name to be easily found.
*
* The map file is line based with space separated fields. The fields are:
@@ -64,6 +64,16 @@ char *mapsmode[3] = { "r", "w", "w"};
FILE *open_map(int modenum, int *choice)
{
int i;
+ struct stat sbuf;
+
+ /* Special case...if ALT_RUN is selected to be /dev/md, then
+ * because we would normally create /dev/md ourselves in order to
+ * stuff array symlinks in there as needed, udev and friends
+ * expect us to create our own tree. So, do so.
+ */
+ if (strcmp(ALT_RUN, "/dev/md") == 0 && stat(ALT_RUN, &sbuf) != 0)
+ mkdir(ALT_RUN, 0755);
+
for (i = 0 ; i < 3 ; i++) {
int fd = open(mapname[i][modenum], mapmode[modenum], 0600);
if (fd >= 0) {
[-- Attachment #1.4: mdadm-3.1.2-rebuild.patch --]
[-- Type: text/plain, Size: 956 bytes --]
commit 0d196a9dcab2e4eb0bce49b63c971d1e06cd9300
Author: Doug Ledford <dledford@redhat.com>
Date: Wed Mar 17 09:28:07 2010 -0400
Only signal a udev change event if we actually write a mapfile in RebuildMap
Signed-off-by: Doug Ledford <dledford@redhat.com>
diff --git a/mapfile.c b/mapfile.c
index eed17c8..950cf62 100644
--- a/mapfile.c
+++ b/mapfile.c
@@ -472,12 +472,14 @@ void RebuildMap(void)
}
sysfs_free(sra);
}
- map_write(map);
+ /* Only trigger a change if we wrote a new map file */
+ if (map_write(map))
+ for (md = mdstat ; md ; md = md->next) {
+ struct mdinfo *sra = sysfs_read(-1, md->devnum,
+ GET_VERSION);
+ sysfs_uevent(sra, "change");
+ sysfs_free(sra);
+ }
map_free(map);
- for (md = mdstat ; md ; md = md->next) {
- struct mdinfo *sra = sysfs_read(-1, md->devnum, GET_VERSION);
- sysfs_uevent(sra, "change");
- sysfs_free(sra);
- }
free_mdstat(mdstat);
}
[-- Attachment #1.5: mdadm-3.1.2-mapname.patch --]
[-- Type: text/plain, Size: 1278 bytes --]
commit 9649a16e4c0e679c0e5bc3e33d816943b85dd910
Author: Doug Ledford <dledford@redhat.com>
Date: Wed Mar 17 10:52:22 2010 -0400
mapfile: if we putting the mapfile in a custom location via ALT_RUN, allow
a custom filename too.
Signed-off-by: Doug Ledford <dledford@redhat.com>
diff --git a/Makefile b/Makefile
index 1035ea8..2aafad0 100644
--- a/Makefile
+++ b/Makefile
@@ -63,8 +63,9 @@ CONFFILEFLAGS = -DCONFFILE=\"$(CONFFILE)\" -DCONFFILE2=\"$(CONFFILE2)\"
# If you don't have /lib/init/rw you might want to use /dev/.something
# e.g. make ALT_RUN=/dev/.mdadm
ALT_RUN = /lib/init/rw
+ALT_MAPFILE = map
VAR_RUN = /var/run
-ALTFLAGS = -DALT_RUN=\"$(ALT_RUN)\"
+ALTFLAGS = -DALT_RUN=\"$(ALT_RUN)\" -DALT_MAPFILE=\"$(ALT_MAPFILE)\"
VARFLAGS = -DVAR_RUN=\"$(VAR_RUN)\"
CFLAGS = $(CWFLAGS) $(CXFLAGS) -DSendmail=\""$(MAILCMD)"\" $(CONFFILEFLAGS) $(ALTFLAGS) $(VARFLAGS)
diff --git a/mapfile.c b/mapfile.c
index 950cf62..51d2b14 100644
--- a/mapfile.c
+++ b/mapfile.c
@@ -55,7 +55,7 @@
char *mapname[3][3] = {
mapnames(VAR_RUN "/map"),
mapnames("/var/run/mdadm.map"),
- mapnames(ALT_RUN "/map")
+ mapnames(ALT_RUN "/" ALT_MAPFILE)
};
int mapmode[3] = { O_RDONLY, O_RDWR|O_CREAT, O_RDWR|O_CREAT | O_TRUNC };
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-03-19 2:13 Minor mdadm fixes Doug Ledford
@ 2010-03-19 7:01 ` Luca Berra
2010-03-23 19:20 ` Doug Ledford
2010-03-24 0:27 ` Neil Brown
1 sibling, 1 reply; 14+ messages in thread
From: Luca Berra @ 2010-03-19 7:01 UTC (permalink / raw)
To: Linux RAID Mailing List
On Thu, Mar 18, 2010 at 10:13:06PM -0400, Doug Ledford wrote:
>Second: mdadm-3.1.2-mapfile.patch
>Problem: Neil's support for putting the mdadm map file wherever you need
>it is nice, but one place in particular needs a special case.
>Specifically, mdadm already creates /dev/md if needed to store symlinks,
>or in the case of mdmon needing to create pid/sock files (if ALT_RUN is
why on earth do you want to set ALT_RUN=/dev/md ?
it is messy enough as it is, let's keep assuming /dev/md contains only
array device files.
besides that you will discover that the only way to make mdmon work is
setting VAR_RUN and ALT_RUN to the same value, something we can
guarantee to be writable even in the direst situation, and i'd keep
/dev/.mdadm
>Fourth: mdadm-3.1.2-mapname.patch
>Feature: If we are able to easily select the location of the mapfile via
>the use of ALT_RUN at compile time, it makes sense to also be able to
this "hunt the mapfile" thing is already ugly as it is, why do we need
an ALT_MAPNAME? if there was any value (except for allowing to set
ALT_RUN to /dev/md) to having the mapfile named differently, just force
it to be called mdadm.map everywere (or incremental.map so the name
reflects the need for the file existance)
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-03-19 7:01 ` Luca Berra
@ 2010-03-23 19:20 ` Doug Ledford
2010-03-23 20:28 ` Luca Berra
0 siblings, 1 reply; 14+ messages in thread
From: Doug Ledford @ 2010-03-23 19:20 UTC (permalink / raw)
To: Linux RAID Mailing List
[-- Attachment #1: Type: text/plain, Size: 2913 bytes --]
On 03/19/2010 03:01 AM, Luca Berra wrote:
> On Thu, Mar 18, 2010 at 10:13:06PM -0400, Doug Ledford wrote:
>> Second: mdadm-3.1.2-mapfile.patch
>> Problem: Neil's support for putting the mdadm map file wherever you need
>> it is nice, but one place in particular needs a special case.
>> Specifically, mdadm already creates /dev/md if needed to store symlinks,
>> or in the case of mdmon needing to create pid/sock files (if ALT_RUN is
> why on earth do you want to set ALT_RUN=/dev/md ?
Because on Fedora /lib/init/rw doesn't exist, and neither does any other
early rw area other than /dev, and it's too late to create a new one, so
I need to use someplace in /dev. However, I agree with Neil that if you
are hiding things under . names then you are admitting something isn't
right. So, I'm perfectly happy putting the map file in /dev/md as I
think it has a perfect right to exist there.
> it is messy enough as it is, let's keep assuming /dev/md contains only
> array device files.
I don't know how many arrays you have on your system, but I would hardly
call /dev/md messy.
> besides that you will discover that the only way to make mdmon work is
> setting VAR_RUN and ALT_RUN to the same value,
[ citation needed ]
> something we can
> guarantee to be writable even in the direst situation, and i'd keep
> /dev/.mdadm
There is no guarantee /dev/.mdadm exists on a clean install, where as I
have made sure that all possible places that might need /dev/md will
create it if it doesn't exist. Hence, the direst of situations is
handled better with /dev/md than with /dev/.mdadm
>> Fourth: mdadm-3.1.2-mapname.patch
>> Feature: If we are able to easily select the location of the mapfile via
>> the use of ALT_RUN at compile time, it makes sense to also be able to
> this "hunt the mapfile" thing is already ugly as it is, why do we need
> an ALT_MAPNAME?
Because /dev/md/map is too easy a name to conflict with a legitimate md
array device name, and that's what it defaults to if you set ALT_RUN to
/dev/md.
> if there was any value (except for allowing to set
> ALT_RUN to /dev/md) to having the mapfile named differently, just force
> it to be called mdadm.map everywere (or incremental.map so the name
> reflects the need for the file existance)
Except I set it to md-device-map. Aside from not conflicting with md
device names, the map file name is mostly irrelevant except to indicate
to people perusing the directory what the file is for as nothing uses it
besides mdadm and mdadm will gladly use it with whatever name you call
it as it both creates and uses the file from the same compile time
configuration setting.
--
Doug Ledford <dledford@redhat.com>
GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-03-23 19:20 ` Doug Ledford
@ 2010-03-23 20:28 ` Luca Berra
0 siblings, 0 replies; 14+ messages in thread
From: Luca Berra @ 2010-03-23 20:28 UTC (permalink / raw)
To: Linux RAID Mailing List
On Tue, Mar 23, 2010 at 03:20:10PM -0400, Doug Ledford wrote:
>On 03/19/2010 03:01 AM, Luca Berra wrote:
>> On Thu, Mar 18, 2010 at 10:13:06PM -0400, Doug Ledford wrote:
>>> Second: mdadm-3.1.2-mapfile.patch
>>> Problem: Neil's support for putting the mdadm map file wherever you need
>>> it is nice, but one place in particular needs a special case.
>>> Specifically, mdadm already creates /dev/md if needed to store symlinks,
>>> or in the case of mdmon needing to create pid/sock files (if ALT_RUN is
>> why on earth do you want to set ALT_RUN=/dev/md ?
>
>Because on Fedora /lib/init/rw doesn't exist, and neither does any other
>early rw area other than /dev, and it's too late to create a new one, so
Neither in Mandriva for that matter.
>I need to use someplace in /dev. However, I agree with Neil that if you
>are hiding things under . names then you are admitting something isn't
>right. So, I'm perfectly happy putting the map file in /dev/md as I
>think it has a perfect right to exist there.
besides the fact that it is perfectly clear that something is wrong, i
am not vehemently opposing that.
>> it is messy enough as it is, let's keep assuming /dev/md contains only
>> array device files.
>
>I don't know how many arrays you have on your system, but I would hardly
>call /dev/md messy.
I did not mean /dev/md is messy, I meant the code for locating the map
file is messy, and the code for handling the mdmon pid and socket is
messy....
for that matter on my test rig i have quite a few file in there
/dev/md/imsm0
/dev/md/Volume0
/dev/md/Volume0p1
....
/dev/md/Volume1
/dev/md/Volume1p1
...
>> besides that you will discover that the only way to make mdmon work is
>> setting VAR_RUN and ALT_RUN to the same value,
>
>[ citation needed ]
two issues:
1) see the thread on this ml starting with:
[mdadm PATCH 5/9] mdmon: mdmon_pid should return pid from either dir
in practice mdadm looks for the mdmon pid and socket in VAR_RUN, but
until takeover the pid and socket are in ALT_RUN, this means you cannot
do array maintainance until /var is mounted rw.
(been there, done that, had to mount a tmpfs on /var and mount --bind
/dev/.mdadm /var/run/mdadm to fix the array)
Neil refused that patch, but allowed redefinition of VAR_RUN, and i did
not insist.
2) having to keep /var/run mounted during shutdown is a PITA, and iirc
you (redhat) were opposed to having mdmon files in /var/run/mdadm, last
time the issue was discussed here.
>> something we can
>> guarantee to be writable even in the direst situation, and i'd keep
>> /dev/.mdadm
>
>There is no guarantee /dev/.mdadm exists on a clean install, where as I
>have made sure that all possible places that might need /dev/md will
>create it if it doesn't exist. Hence, the direst of situations is
>handled better with /dev/md than with /dev/.mdadm
yes, in mdmon.c ALT_RUN is created before usage, in mapfile.c it is not,
adding an mkdir there would be useful, independent of what we set
ALT_RUN to.
>>> Fourth: mdadm-3.1.2-mapname.patch
>>> Feature: If we are able to easily select the location of the mapfile via
>>> the use of ALT_RUN at compile time, it makes sense to also be able to
>> this "hunt the mapfile" thing is already ugly as it is, why do we need
>> an ALT_MAPNAME?
>
>Because /dev/md/map is too easy a name to conflict with a legitimate md
>array device name, and that's what it defaults to if you set ALT_RUN to
>/dev/md.
>
>> if there was any value (except for allowing to set
>> ALT_RUN to /dev/md) to having the mapfile named differently, just force
>> it to be called mdadm.map everywere (or incremental.map so the name
>> reflects the need for the file existance)
>
>Except I set it to md-device-map. Aside from not conflicting with md
>device names, the map file name is mostly irrelevant except to indicate
>to people perusing the directory what the file is for as nothing uses it
>besides mdadm and mdadm will gladly use it with whatever name you call
>it as it both creates and uses the file from the same compile time
>configuration setting.
I know all this, what i meant is:
why do you feel there is a need for ALT_MAPNAME, would not it be easier
to just define a MAPFILE_NAME with a fixed path and avoid the open_map
juggling in mapfile.c, there is no need for that code, since, as you
state: "mdadm both creates and uses the file from the same compile time
configuration setting."
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-03-19 2:13 Minor mdadm fixes Doug Ledford
2010-03-19 7:01 ` Luca Berra
@ 2010-03-24 0:27 ` Neil Brown
2010-03-24 17:48 ` Doug Ledford
1 sibling, 1 reply; 14+ messages in thread
From: Neil Brown @ 2010-03-24 0:27 UTC (permalink / raw)
To: Doug Ledford; +Cc: Linux RAID Mailing List
On Thu, 18 Mar 2010 22:13:06 -0400
Doug Ledford <dledford@redhat.com> wrote:
> I have several minor fixes against the 3.1.2 mdadm.
Thanks!
>
> First attachment: mdadm-3.1.1-endian.patch
> Problem: on ppc arches mdadm would fail to assemble arrays if you used
> mdadm -Db /dev/<array> to populate the ARRAY line in mdadm.conf. This
> is because the code in Detail.c calls fname_from_uuid() which honors the
> swapuuid setting in the superblock switch struct. This is great for ddf
> and imsm superblocks which call fname_from_uuid() all over the place and
> are therefore consistent. However, super1.c does *not* call
> fname_from_uuid()...ever. And super1.c does *not* perform the byte
> swapping as per swapuuid in the switch struct. At least not on
> *display*, but it *does* honor swapuuid on actual analysis of the uuid
> in terms of figuring out matches. So, if you attempt to change the
> value of swapuuid in the super1 switch struct, then uuid's never match
> what's printed out (aka, the output of -D and -E both fail to match
> against the detected uuid when assembling, although at least -D and -E
> are now consistent). Instead, I did a simple (and gross) hack to ignore
> swapuuid in fname_from_uuid() only for version 1 superblocks. The other
> alternative might be to make super1.c call fname_from_uuid() all over
> the place instead of printing things out itself, but that would involve
> struct changes as the way it stores/accesses uuids in super1 is not per
> char like fname_from_uuid and that's precisely part of the problem.
Ohh, that is horrible isn't it.
I have applied your patch but I very much hope to "fix" the problem in a more
comprehensive way so that your fix will disappear.
I suspect that I will change the common code to use an array of bytes and do
any conversion from u32[4] to u8[1] in the per-metadata code.
>
> Second: mdadm-3.1.2-mapfile.patch
> Problem: Neil's support for putting the mdadm map file wherever you need
> it is nice, but one place in particular needs a special case.
> Specifically, mdadm already creates /dev/md if needed to store symlinks,
> or in the case of mdmon needing to create pid/sock files (if ALT_RUN is
> set to /dev/md). This creates an expectation in udev and friends that
> mdadm will create /dev/md when needed, period. We need it if we place
> the mapfile in /dev/md prior to any symlinks or mdmon files being
> created, so special case that one location in order to make us
> consistent with both mdmon and mdopen. If we don't, we will fail to
> create our mapfile.
I thought udev always created the directory if it was needed, and always
removed it if it became empty after deleting an entry. But I have no strong
basis for believing that.
However I do think that mdadm should create the directory in this case - not
any parents, but if they exist and are writeable, then create the directory.
So I modified you patch to do that when creating the mapfile (not when
reading it).
>
> Third: mdadm-3.1.2-rebuild.patch
> Problem: The above issue pointed out a different issue. Specifically,
> if we can't open our mapfile, then we call RebuildMap, and at the end of
> the rebuild, we trigger a new change event on the raid device. Well, if
> the write of the new map file failed because the directory we wanted to
> write into didn't exist, this creates an infinite loop where udev calls
> mdadm, mdadm fails to open file, mdadm rebuils map, mdadm fails to write
> new map, but mdadm triggers a change event to cause udev to rerun mdadm
> on the belief that the map file *was* updated. So, if we fail to write
> the new mapfile, don't signal a change event, let the situation drop.
> This avoids the infinite loop.
Very appropriate - thanks.
>
> Fourth: mdadm-3.1.2-mapname.patch
> Feature: If we are able to easily select the location of the mapfile via
> the use of ALT_RUN at compile time, it makes sense to also be able to
> set the filename. This way you can make it something obvious such as
> /dev/md + md-device-map which makes the file name both unlikely to
> conflict with a device name (where as I could see map conflicting in
> certain specialized scenarios, like the array holding map data at the
> local governmental office) and obvious in purpose.
>
>
Seems reasonable. I've included that patch too.
Thanks a lot,
NeilBrown
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Minor mdadm fixes
2010-03-24 0:27 ` Neil Brown
@ 2010-03-24 17:48 ` Doug Ledford
0 siblings, 0 replies; 14+ messages in thread
From: Doug Ledford @ 2010-03-24 17:48 UTC (permalink / raw)
To: Neil Brown; +Cc: Linux RAID Mailing List
[-- Attachment #1: Type: text/plain, Size: 1370 bytes --]
On 03/23/2010 08:27 PM, Neil Brown wrote:
> I thought udev always created the directory if it was needed, and always
> removed it if it became empty after deleting an entry. But I have no strong
> basis for believing that.
I don't know about the removing of a directory, but it will only create
a directory if we indicate it should place a symlink in it (speaking of
/dev/md/). And we only do that if mdadm -D --export prints out
MD_DEVNAME, and we only print that out if the name field of the
superblock includes /dev/md/. So, if the name is something shorthand,
like host:0 or host:md0, we never print out MD_DEVNAME and so udev
doesn't create the symlinks in /dev/md.
But, regardless of that, we need the mapfile prior to devices being
brought up, so prior to any possible directory creation by udev. Hence
we must take care of the issue ourselves.
> However I do think that mdadm should create the directory in this case - not
> any parents, but if they exist and are writeable, then create the directory.
>
> So I modified you patch to do that when creating the mapfile (not when
> reading it).
Makes sense, thanks.
--
Doug Ledford <dledford@redhat.com>
GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2010-03-24 17:48 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-19 2:13 Minor mdadm fixes Doug Ledford
2010-03-19 7:01 ` Luca Berra
2010-03-23 19:20 ` Doug Ledford
2010-03-23 20:28 ` Luca Berra
2010-03-24 0:27 ` Neil Brown
2010-03-24 17:48 ` Doug Ledford
-- strict thread matches above, loose matches on Subject: below --
2010-01-11 20:38 Doug Ledford
2010-01-12 0:49 ` Mr. James W. Laferriere
2010-01-12 3:10 ` Andre Noll
2010-01-12 3:36 ` Doug Ledford
2010-01-12 4:39 ` Andre Noll
2010-01-12 4:46 ` Doug Ledford
2010-01-12 5:21 ` Andre Noll
2010-01-18 22:05 ` Neil Brown
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).