* [RFC PATCH 0/3] mdraid rootfs support
@ 2009-02-05 22:49 Dan Williams
[not found] ` <20090205224808.18610.14957.stgit-p8uTFz9XbKjBPTuBivz2/GFmcEqAMTzPQQ4Iyu8u01E@public.gmane.org>
0 siblings, 1 reply; 24+ messages in thread
From: Dan Williams @ 2009-02-05 22:49 UTC (permalink / raw)
To: initramfs-u79uwXL29TY76Z2rM5mHXA
Cc: neilb-l3A5Bk7waGM, jacek.danecki-ral2JQCrhuEAvxtiuMwx3w
This series is a first take at dracut support for an mdraid rootfs. It
includes considerations for the new external metadata formats supported in the
latest development branch of mdadm:
git://neil.brown.name/mdadm devel-3.0
This is an RFC because it is not clear to me that a single call to "udevadm
settle" is enough to guarantee discovery of all storage devices ahead of raid
assembly. A cursory test with four AHCI-attached drives was successful.
Regards,
Dan
---
Dan Williams (3):
add more disk id helpers to udevexe
raid: external and internal metadata support
gen-mod-lists: create lists of modules that may talk to a root device
dracut | 13 ++++++++++---
gen-mod-lists | 34 ++++++++++++++++++++++++++++++++++
init | 10 ++++++++++
3 files changed, 54 insertions(+), 3 deletions(-)
create mode 100755 gen-mod-lists
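
On the open question in the cover letter: "udevadm settle" waits for the events that are currently queued, so devices that show up later would still be missed. One hedged alternative is to poll for the expected member devices, with a bounded number of retries, before assembling. The sketch below assumes a hypothetical helper, wait_for_devices, and a caller that already knows which device nodes to expect; neither is part of this series.

```shell
#!/bin/bash
# Hypothetical helper, not part of the patch series.
# Usage: wait_for_devices SECONDS PATH...
# Returns 0 once every PATH exists, 1 if SECONDS one-second retries run out.
wait_for_devices() {
    local remaining=$1; shift
    local dev missing
    while :; do
        missing=0
        for dev in "$@"; do
            [ -e "$dev" ] || missing=1
        done
        [ "$missing" -eq 0 ] && return 0
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}

# Illustrative use in an init script (device names are assumptions):
#   /sbin/udevadm settle
#   wait_for_devices 30 /dev/sda /dev/sdb && /sbin/mdadm -Asc /etc/mdadm.conf
```

The helper only uses the shell and sleep, both of which are already in the initramfs binary list.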
--
To unsubscribe from this list: send the line "unsubscribe initramfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[parent not found: <20090205224808.18610.14957.stgit-p8uTFz9XbKjBPTuBivz2/GFmcEqAMTzPQQ4Iyu8u01E@public.gmane.org>]
* [RFC PATCH 1/3] gen-mod-lists: create lists of modules that may talk to a root device
@ 2009-02-05 22:49 Dan Williams
2 siblings, 0 replies; 24+ messages in thread
From: Dan Williams @ 2009-02-05 22:49 UTC (permalink / raw)
To: initramfs-u79uwXL29TY76Z2rM5mHXA
Cc: neilb-l3A5Bk7waGM, jacek.danecki-ral2JQCrhuEAvxtiuMwx3w

notting-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org:
The idea is that we don't want to include every single module, but
we want to include every module that might define a block device to
boot from, or a network device to network boot from. Having it in
the upstream kernel would be helpful, although how it's generated
now is obviously a hack. Doing it at runtime in dracut would work,
but would be obviously slow.

This is a temporary hack to duplicate this functionality from the Fedora kernel
srpm in dracut. Also added "raid" modules.

Signed-off-by: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 gen-mod-lists | 34 ++++++++++++++++++++++++++++++++++
 1 files changed, 34 insertions(+), 0 deletions(-)
 create mode 100755 gen-mod-lists

diff --git a/gen-mod-lists b/gen-mod-lists
new file mode 100755
index 0000000..13999d7
--- /dev/null
+++ b/gen-mod-lists
@@ -0,0 +1,34 @@
+#!/bin/bash
+
+# Copied from kernel.spec (kernel-2.6.27.12-78.2.8.fc9.src.rpm)
+# Creates /lib/modules/$KernelVer/modules.{block,networking,raid}
+
+KernelVer=$1
+[ -z "$KernelVer" ] && KernelVer=$(uname -r)
+
+if [ ! -d /lib/modules/$KernelVer ]; then
+    echo "error: could not find /lib/modules/$KernelVer"
+    exit 1
+fi
+
+find /lib/modules/$KernelVer -name "*.ko" -type f >modnames
+
+# Generate a list of modules for block and networking.
+
+fgrep /drivers/ modnames | xargs --no-run-if-empty nm -upA |
+sed -n 's,^.*/\([^/]*\.ko\): *U \(.*\)$,\1 \2,p' > drivers.undef
+
+collect_modules_list()
+{
+  sed -r -n -e "s/^([^ ]+) \\.?($2)\$/\\1/p" drivers.undef |
+  LC_ALL=C sort -u > /lib/modules/$KernelVer/modules.$1
+}
+
+collect_modules_list networking \
+  'register_netdev|ieee80211_register_hw|usbnet_probe'
+collect_modules_list block \
+  'ata_scsi_ioctl|scsi_add_host|blk_init_queue|register_mtd_blktrans|scsi_esp_register'
+
+# mdraid modules, could be made part of 'block'
+collect_modules_list raid \
+  'register_md_personality'
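
To make the filtering step in gen-mod-lists concrete, the sketch below runs the same sed expression as collect_modules_list against a hand-written drivers.undef. The module and symbol pairs are made-up sample data, and filter_modules is a hypothetical stand-in that reads a named file and prints to stdout instead of writing under /lib/modules.

```shell
#!/bin/bash
# Sample data in the "module.ko symbol" format that the first sed in
# gen-mod-lists produces from `nm -upA` output. These entries are illustrative.
undef=$(mktemp)
cat > "$undef" <<'EOF'
raid1.ko register_md_personality
raid456.ko register_md_personality
raid456.ko async_xor
ahci.ko ata_scsi_ioctl
e1000.ko register_netdev
EOF

# Hypothetical stand-in for collect_modules_list: same sed expression, but it
# reads the file named in $1 and prints the module names whose undefined
# symbols match the alternation in $2.
filter_modules() {
    sed -r -n -e "s/^([^ ]+) \\.?($2)\$/\\1/p" "$1" | LC_ALL=C sort -u
}

filter_modules "$undef" 'register_md_personality'   # raid1.ko, raid456.ko
filter_modules "$undef" 'ata_scsi_ioctl|scsi_add_host'   # ahci.ko
```

A module is listed once even if it references several matching symbols, because of the final sort -u.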
* [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-05 22:49 Dan Williams
2 siblings, 1 reply; 24+ messages in thread
From: Dan Williams @ 2009-02-05 22:49 UTC (permalink / raw)
To: initramfs-u79uwXL29TY76Z2rM5mHXA
Cc: neilb-l3A5Bk7waGM, jacek.danecki-ral2JQCrhuEAvxtiuMwx3w

External metadata support implies that metadata events are handled by a
userspace daemon. This daemon, mdmon, needs to be started ahead of the
rootfs being mounted to handle the raid volume dirty bit. Even if the
rootfs is mounted read-only the rootdev may still be written by
filesystem journal-playback operations.

After the rootfs is mounted to /sysroot, mdmon is restarted in the new
namespace. The command "mdmon /proc/mdstat /sysroot" tells mdmon to
terminate any instances in the current namespace and then launch new
instances, chroot(2) to /sysroot, per container device found in
/proc/mdstat.

Signed-off-by: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 dracut | 11 +++++++++--
 init   | 10 ++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/dracut b/dracut
index ef2ca42..513b6c6 100755
--- a/dracut
+++ b/dracut
@@ -69,13 +69,14 @@ initdir=$(mktemp -d -t initramfs.XXXXXX)
 exe="/bin/bash /bin/mount /bin/mknod /bin/mkdir /sbin/modprobe /sbin/udevd /sbin/udevadm /sbin/nash /bin/kill /sbin/pidof /bin/sleep /bin/echo /usr/sbin/chroot"
 lvmexe="/sbin/lvm"
 cryptexe="/sbin/cryptsetup"
+raidexe="/sbin/mdadm /sbin/mdmon"
 # and some things that are nice for debugging
 debugexe="/bin/ls /bin/cat /bin/ln /bin/ps /bin/grep /bin/more"
 # udev things we care about
 udevexe="/lib/udev/vol_id /lib/udev/console_init"

 # install base files
-for binary in $exe $debugexe $udevexe $lvmexe $cryptexe ; do
+for binary in $exe $debugexe $udevexe $lvmexe $cryptexe $raidexe ; do
     inst $binary $initdir
 done

@@ -152,7 +153,7 @@ cp $switchroot $initdir/sbin/switch_root
 mkdir -p $initdir/etc $initdir/proc $initdir/sys $initdir/sysroot $initdir/dev/pts

 # FIXME: hard-coded module list of doom.
-[ -z "$modules" ] && modules="=ata =block =drm dm-crypt aes sha256 cbc"
+[ -z "$modules" ] && modules="=ata =block =drm =raid dm-crypt aes sha256 cbc"

 mkdir -p $initdir/lib/modules/$kernel
 # expand out module deps, etc
@@ -171,6 +172,12 @@ if [ -x /usr/libexec/plymouth/plymouth-populate-initrd ]; then
     /usr/libexec/plymouth/plymouth-populate-initrd -t "$initdir" || :
 fi

+# raid
+# mdadm.conf allows mdadm to disambiguate foreign arrays for some metadata types
+# check /etc and /etc/mdadm (/etc wins if both are present)
+[ -f /etc/mdadm/mdadm.conf ] && inst /etc/mdadm/mdadm.conf "$initdir" /etc/mdadm.conf
+[ -f /etc/mdadm.conf ] && inst /etc/mdadm.conf "$initdir"
+
 pushd $initdir >/dev/null
 find . |cpio -H newc -o |gzip -9 > $outfile
 popd >/dev/null
diff --git a/init b/init
index 706127f..0294502 100755
--- a/init
+++ b/init
@@ -46,6 +46,13 @@ mknod /dev/tty1 c 4 1
 /sbin/udevd --daemon
 /sbin/udevadm trigger

+# start any defined raid arrays
+# we settle before assembling to hopefully prevent prematurely degrading arrays
+if [ -f /etc/mdadm.conf ]; then
+    /sbin/udevadm settle
+    /sbin/mdadm -Asc /etc/mdadm.conf
+fi
+
 # mount the rootfs
 NEWROOT="/sysroot"

@@ -110,6 +117,9 @@ kill `pidof udevd`

 [ -x /bin/plymouth ] && /bin/plymouth --newroot=$NEWROOT

+# switch any mdmon instances to newroot
+[ -f /etc/mdadm.conf ] && /sbin/mdmon /proc/mdstat $NEWROOT
+
 # FIXME: nash die die die
 exec /sbin/switch_root
 # davej doesn't like initrd bugs
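
The comment in the dracut hunk says /etc/mdadm.conf wins when both configuration files exist; the patch appears to express that through the order of its two inst calls. A minimal sketch of the same precedence as an explicit choice, assuming a hypothetical helper pick_mdadm_conf that takes a root prefix so it can be exercised outside a real system:

```shell
#!/bin/bash
# Hypothetical helper, not part of the patch: print the mdadm.conf that should
# be copied into the initramfs, preferring ROOT/etc/mdadm.conf over
# ROOT/etc/mdadm/mdadm.conf. Returns 1 if neither file exists.
pick_mdadm_conf() {
    local root=${1:-}
    if [ -f "$root/etc/mdadm.conf" ]; then
        echo "$root/etc/mdadm.conf"
    elif [ -f "$root/etc/mdadm/mdadm.conf" ]; then
        echo "$root/etc/mdadm/mdadm.conf"
    else
        return 1
    fi
}

# Illustrative use with dracut's inst helper (assumed calling convention,
# copied from the hunk above):
#   conf=$(pick_mdadm_conf) && inst "$conf" "$initdir" /etc/mdadm.conf
```

Making the choice explicit avoids depending on whether a second inst call overwrites the first.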
[parent not found: <20090205224920.18610.63979.stgit-p8uTFz9XbKjBPTuBivz2/GFmcEqAMTzPQQ4Iyu8u01E@public.gmane.org>]
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 16:40 Jeremy Katz
0 siblings, 1 reply; 24+ messages in thread
From: Jeremy Katz @ 2009-02-06 16:40 UTC (permalink / raw)
To: Dan Williams
Cc: initramfs-u79uwXL29TY76Z2rM5mHXA, neilb-l3A5Bk7waGM, jacek.danecki-ral2JQCrhuEAvxtiuMwx3w

On Thursday, February 05 2009, Dan Williams said:
> index 706127f..0294502 100755
> --- a/init
> +++ b/init
> @@ -46,6 +46,13 @@ mknod /dev/tty1 c 4 1
>  /sbin/udevd --daemon
>  /sbin/udevadm trigger
>
> +# start any defined raid arrays
> +# we settle before assembling to hopefully prevent prematurely degrading arrays
> +if [ -f /etc/mdadm.conf ]; then
> +    /sbin/udevadm settle
> +    /sbin/mdadm -Asc /etc/mdadm.conf
> +fi
> +

RAID arrays should be getting started by udev rules, not by explicit
calls to mdadm in /init. Yes, this means having proper integration with
udev for your kernel pieces. But this ends up helping everything as it
will also let us lose the multiple redundant calls to things like mdadm
(and lvm, etc) throughout the boot process which should just be
occurring as devices show up.

> +# switch any mdmon instances to newroot
> +[ -f /etc/mdadm.conf ] && /sbin/mdmon /proc/mdstat $NEWROOT
> +

Is there a real need for mdmon to start prior to being in the real
rootfs?

Jeremy
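
As an illustration of the udev-driven approach suggested above, a rule can hand each RAID member to "mdadm --incremental" as the device appears, instead of assembling everything from /init. The sketch below only writes such a rule into a scratch directory. The file name 65-md-incremental.rules, the match on ID_FS_TYPE, and the mdadm path are assumptions for illustration and were not proposed in this thread; external-metadata members may well need a different match.

```shell
#!/bin/bash
# Write an example rule into a scratch directory rather than a live
# udev rules directory, so nothing on the running system changes.
rules_dir=$(mktemp -d)
cat > "$rules_dir/65-md-incremental.rules" <<'EOF'
# Assumed rule: when a block device identified as an md member is added,
# pass it to mdadm, which starts the array once enough members are present.
SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
EOF

cat "$rules_dir/65-md-incremental.rules"
```

With a rule like this in the initramfs, /init would not need its own settle-then-assemble step for arrays whose members are all discovered by udev.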
[parent not found: <20090206164019.GD552-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* RE: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 16:50 Danecki, Jacek
1 sibling, 1 reply; 24+ messages in thread
From: Danecki, Jacek @ 2009-02-06 16:50 UTC (permalink / raw)
To: Jeremy Katz, Williams, Dan J
Cc: initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

> Is there a real need for mdmon to start prior to being in the real
> rootfs?

mdmon is needed to change raid array to RW mode, so as long as rootfs is mounted RO, mdmon can be started in real rootfs.
[parent not found: <A9DE54D0CD747C4CB06DCE5B6FA2246F4B496AFA-IGOiFh9zz4yvNW/NfzhIbrfspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 16:55 Dan Williams
1 sibling, 0 replies; 24+ messages in thread
From: Dan Williams @ 2009-02-06 16:55 UTC (permalink / raw)
To: Danecki, Jacek
Cc: Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

On Fri, Feb 6, 2009 at 9:50 AM, Danecki, Jacek <jacek.danecki-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>
>> Is there a real need for mdmon to start prior to being in the real
>> rootfs?
>
> mdmon is needed to change raid array to RW mode, so as long as rootfs is mounted RO, mdmon can be started in real rootfs.

No, that is what I originally thought until I tried to mount an xfs
filesystem that had been uncleanly shutdown. Even if the rootfs is
mounted read-only the backing device needs to be read-write to recover
the journal.

--
Dan
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 16:56 Bill Nottingham
1 sibling, 1 reply; 24+ messages in thread
From: Bill Nottingham @ 2009-02-06 16:56 UTC (permalink / raw)
To: Danecki, Jacek
Cc: Jeremy Katz, Williams, Dan J, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

Danecki, Jacek (jacek.danecki-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org) said:
> > Is there a real need for mdmon to start prior to being in the real
> > rootfs?
>
> mdmon is needed to change raid array to RW mode, so as long as rootfs is mounted RO, mdmon can be started in real rootfs.

So, for one particular specific type of block device, you need
a daemon to switch it writable. Every other type of block device
can handle this without separate tooling. I'm not seeing how this
is an improvement.

Bill
[parent not found: <20090206165601.GF11144-Zdt1ptygihhQcNjhGXsBABcY2uh10dtjAL8bYrjMMd8@public.gmane.org>]
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 17:27 Dan Williams
0 siblings, 1 reply; 24+ messages in thread
From: Dan Williams @ 2009-02-06 17:27 UTC (permalink / raw)
To: Bill Nottingham
Cc: Danecki, Jacek, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

On Fri, Feb 6, 2009 at 9:56 AM, Bill Nottingham <notting-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> So, for one particular specific type of block device, you need
> a daemon to switch it writable. Every other type of block device
> can handle this without separate tooling. I'm not seeing how this
> is an improvement.
>

It is not just setting writable, mdmon is also there to clear the bit
when writes have quiesced. Raid devices have always been special in
that they need to manage a dirty bit in their metadata to determine if
a resync needs to be performed after a dirty shutdown. With hardware
raid or pure kernel (MD metadata) raid this mechanism is hidden.

External metadata raid is akin to fuse filesystems. The kernel
provides the generic infrastructure and a userspace daemon handles the
implementation details. The improvement is that with one kernel
implementation we can support any number of metadata formats.

--
Dan
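
The dirty-bit handshake described above can be modelled as a small state machine. The sketch below is a toy model, not mdmon's code. The state names are assumed to follow the md array_state sysfs attribute (write-pending, active, active-idle, clean), and the function monitor_step and the variables metadata_dirty and next are hypothetical.

```shell
#!/bin/bash
# Toy model of a userspace metadata handler. metadata_dirty stands in for the
# dirty bit that a real handler would write into the on-disk metadata.
metadata_dirty=0
next=

# Given the state the kernel reports in $1, update the simulated metadata and
# store in $next the state the handler would write back to the kernel.
monitor_step() {
    case "$1" in
        write-pending)
            metadata_dirty=1    # mark dirty before any write is let through
            next=active
            ;;
        active-idle)
            metadata_dirty=0    # writes have quiesced, mark clean again
            next=clean
            ;;
        *)
            next=$1             # no metadata change for other states
            ;;
    esac
}

monitor_step write-pending; echo "$next dirty=$metadata_dirty"   # active dirty=1
monitor_step active-idle;   echo "$next dirty=$metadata_dirty"   # clean dirty=0
```

The point of the model is the ordering: the metadata update happens in userspace before the array is allowed to move to the next state.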
[parent not found: <e9c3a7c20902060927j2b900940kd851573469110135-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 17:38 Bill Nottingham
0 siblings, 1 reply; 24+ messages in thread
From: Bill Nottingham @ 2009-02-06 17:38 UTC (permalink / raw)
To: Dan Williams
Cc: Danecki, Jacek, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

Dan Williams (dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org) said:
> It is not just setting writable, mdmon is also there to clear the bit
> when writes have quiesced.

Let me just see if I understand this infrastructure correctly.

- device is set writable
- kernel tells userspace
- userspace frobs bit in superblock to say 'I want to be dirty!'
- userspace tells kernel
- kernel writes bit to disk
... stuff happens ...
- userspace tells kernel to unmount, or remount R/O
- kernel tells userspace "hey, i unmounted this"
  (userspace freaks out because the filesystem the daemon is running on
  just went away)
- userspace frobs bit in superblock to say 'This array is CLEAN!'
- userspace tells kernel
- kernel writes bit to disk

Is that really how it's supposed to work?

So, why isn't the ext* journal or filesystem unclean flag
handled via a userspace file monitoring daemon, then?

Bill
[parent not found: <20090206173814.GA3541-Zdt1ptygihhQcNjhGXsBABcY2uh10dtjAL8bYrjMMd8@public.gmane.org>]
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 18:00 Jacek Danecki
1 sibling, 1 reply; 24+ messages in thread
From: Jacek Danecki @ 2009-02-06 18:00 UTC (permalink / raw)
To: Bill Nottingham
Cc: Williams, Dan J, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

Bill Nottingham wrote:
>
> So, why isn't the ext* journal or filesystem unclean flag
> handled via a userspace file monitoring daemon, then?

Dan, Neil

Are there any plans to rewrite mdmon in kernel space?
[parent not found: <498C7AD8.6080105-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>]
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 19:34 NeilBrown
0 siblings, 1 reply; 24+ messages in thread
From: NeilBrown @ 2009-02-06 19:34 UTC (permalink / raw)
To: Jacek Danecki
Cc: Bill Nottingham, Williams, Dan J, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Sat, February 7, 2009 5:00 am, Jacek Danecki wrote:
> Bill Nottingham wrote:
>>
>> So, why isn't the ext* journal or filesystem unclean flag
>> handled via a userspace file monitoring daemon, then?
>
> Dan, Neil
>
> Are there any plans to rewrite mdmon in kernel space?
>

Definitely not.

There is more to this than the 'unclean' flag.
The really important task for mdmon (which hopefully it never has
to perform...) is to record device failures (which is what RAID is
really all about).

If a device fails while trying to write to it, we cannot allow that
write to complete until the other devices have had that device failure
recorded on them. Otherwise, following an unclean shutdown we might trust
the data that is on that drive, which is now out-of-date.

The task of mdmon is to discover when there have been write errors,
record the device failure in the metadata, then allow the write to
complete. It has a number of other tasks as well, but that is the
important one which means that it must always be running when the
array is writable.

NeilBrown
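
The ordering constraint described above (record the failure on the surviving devices first, and only then let the write complete) can be illustrated with a toy model. Everything in it is hypothetical: plain files stand in for per-device metadata, the member names are sample data, and handle_write_error is not an mdmon or kernel interface.

```shell
#!/bin/bash
# Each file under $meta_dir stands in for the metadata area of one member.
meta_dir=$(mktemp -d)
for dev in sda sdb sdc; do : > "$meta_dir/$dev"; done

log=()   # order of events, kept so the sequence can be inspected

# Record that member $1 failed on every other member, then release the write.
handle_write_error() {
    local failed=$1 path name
    for path in "$meta_dir"/*; do
        name=${path##*/}
        [ "$name" = "$failed" ] && continue
        echo "failed: $failed" >> "$path"
        log+=("recorded on $name")
    done
    # Only after all survivors carry the failure record may the write finish.
    log+=("write completed")
}

handle_write_error sdb
printf '%s\n' "${log[@]}"
```

After an unclean shutdown in this model, sda and sdc both say that sdb is stale, so its out-of-date contents would not be trusted.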
[parent not found: <2c0cae741a7229789cd777d93180072a.squirrel-eq65iwfR9nKIECXXMXunQA@public.gmane.org>]
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 20:03 Bill Nottingham
0 siblings, 1 reply; 24+ messages in thread
From: Bill Nottingham @ 2009-02-06 20:03 UTC (permalink / raw)
To: NeilBrown
Cc: Jacek Danecki, Williams, Dan J, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

NeilBrown (neilb-l3A5Bk7waGM@public.gmane.org) said:
> There is more to this than the 'unclean' flag.
> The really important task for mdmon (which hopefully it never has
> to perform...) is to record device failures (which is what RAID is
> really all about).
>
> If a device fails while trying to write to it, we cannot allow that
> write to complete until the other devices have had that device failure
> recorded on them. Otherwise, following an unclean shutdown we might trust
> the data that is on that drive, which is now out-of-date.

OK, so:

1) kernel sends write request. If error....
2) <some error occurs>
3) kernel sends error to userspace
4) mdmon wakes up
5) mdmon decides where to record this
6) mdmon writes to super blocks
7) go to step one, hope you don't hit step 2 this time

This now means that reliable suspend and resume is completely
impossible on RAID devices, just as it is on FUSE. You can't have
waking up userspace be part of your write and sync process - you've
just deadlocked at step 3/4.

Unless I've missed something here?

Bill
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-08 19:08 Szabolcs Szakacsits
0 siblings, 0 replies; 24+ messages in thread
From: Szabolcs Szakacsits @ 2009-02-08 19:08 UTC (permalink / raw)
To: initramfs-u79uwXL29TY76Z2rM5mHXA

Bill Nottingham <notting@...> writes:
> OK, so:
>
> 1) kernel sends write request. If error....
> 2) <some error occurs>
> 3) kernel sends error to userspace
> 4) mdmon wakes up
> 5) mdmon decides where to record this
> 6) mdmon writes to super blocks
> 7) go to step one, hope you don't hit step 2 this time
>
> This now means that reliable suspend and resume is completely
> impossible on RAID devices, just as it is on FUSE.

It's not clear from the context but I suppose you mean only FUSE root
file systems (e.g. what Ubuntu/WUBI has on NTFS via NTFS-3G).

One of the solutions is to apply the same mechanism that swapfiles use.
That avoids user space completely. The suspend information goes to a
dynamically (userspace involved) and a statically (no userspace
involved) allocated space on the suspend device.

Regards,
Szaka

--
NTFS-3G: http://ntfs-3g.org
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 18:12 Dan Williams
1 sibling, 2 replies; 24+ messages in thread
From: Dan Williams @ 2009-02-06 18:12 UTC (permalink / raw)
To: Bill Nottingham
Cc: Danecki, Jacek, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

On Fri, Feb 6, 2009 at 10:38 AM, Bill Nottingham <notting-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> Dan Williams (dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org) said:
>> It is not just setting writable, mdmon is also there to clear the bit
>> when writes have quiesced.
>
> Let me just see if I understand this infrastructure correctly.
>
> - device is set writable
> - kernel tells userspace

...tells userspace that we want to transition the array from clean to dirty, yes

> - userspace frobs bit in superblock to say 'I want to be dirty!'

yes

> - userspace tells kernel

...yup array is dirty, start writing.

> - kernel writes bit to disk
> ... stuff happens ...
> - userspace tells kernel to unmount, or remount R/O
> - kernel tells userspace "hey, i unmounted this"
> (userspace freaks out because the filesystem the daemon is running on
> just went away)

mdmon does not know or care if the *filesystem* is read-only. It is
reading and writing /proc, /sys, and the raw disk devices.

> - userspace frobs bit in superblock to say 'This array is CLEAN!'

...not in this scenario no.

> - userspace tells kernel
> - kernel writes bit to disk
>
> Is that really how it's supposed to work?

You lost me at userspace freaks out, but that is the general flow.

> So, why isn't the ext* journal or filesystem unclean flag
> handled via a userspace file monitoring daemon, then?

I'm not trying to be obtuse, but because it isn't. Put another way,
consider what extra tools the initramfs would need if we wanted to
support an ntfs-3g rootfs.

--
Dan
[parent not found: <e9c3a7c20902061012w15a31e7br6ce2074b7b9db555-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 18:21 Bill Nottingham
0 siblings, 1 reply; 24+ messages in thread
From: Bill Nottingham @ 2009-02-06 18:21 UTC (permalink / raw)
To: Dan Williams
Cc: Danecki, Jacek, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

Dan Williams (dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org) said:
> > - kernel tells userspace "hey, i unmounted this"
> > (userspace freaks out because the filesystem the daemon is running on
> > just went away)
> mdmon does not know or care if the *filesystem* is read-only. It is
> reading and writing /proc, /sys, and the raw disk devices.

Your daemon has to be running from somewhere. That tends to be
reduced to the initramfs that you've already deleted and switchrooted
away from (in which case, good luck on upgrades of your userspace tools -
you're stuck with the version in the initramfs you booted from, even
if your later userspace tools end up using some later protocol).

You can't run it from the rootfs, because that's a chicken/egg
scenario (or you'll switch to it, and then be unable to mark it
clean, because you can't mark the array r/o, because the filesystem
is r/w, which you can't undo because the daemon is running on
it...)

Long-lived daemons running from the initramfs aren't really good.
We don't run udev that way.

> > - userspace frobs bit in superblock to say 'This array is CLEAN!'
> ...not in this scenario no.

Then when is the clean bit set?

> > So, why isn't the ext* journal or filesystem unclean flag
> > handled via a userspace file monitoring daemon, then?
>
> I'm not trying to be obtuse, but because it isn't. Put another way,
> consider what extra tools the initramfs would need if we wanted to
> support an ntfs-3g rootfs.

You're asking for *the exact same thing*... just RAID specific.

Bill
[parent not found: <20090206182118.GA4413-Zdt1ptygihhQcNjhGXsBABcY2uh10dtjAL8bYrjMMd8@public.gmane.org>]
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
@ 2009-02-06 19:19 Dan Williams
0 siblings, 1 reply; 24+ messages in thread
From: Dan Williams @ 2009-02-06 19:19 UTC (permalink / raw)
To: Bill Nottingham
Cc: Danecki, Jacek, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

On Fri, Feb 6, 2009 at 11:21 AM, Bill Nottingham <notting-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> Dan Williams (dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org) said:
>> > - kernel tells userspace "hey, i unmounted this"
>> > (userspace freaks out because the filesystem the daemon is running on
>> > just went away)
>> mdmon does not know or care if the *filesystem* is read-only. It is
>> reading and writing /proc, /sys, and the raw disk devices.
>
> Your daemon has to be running from somewhere. That tends to be
> reduced to the initramfs that you've already deleted and switchrooted
> away from (in which case, good luck on upgrades of your userspace tools -
> you're stuck with the version in the initramfs you booted from, even
> if your later userspace tools end up using some later protocol).

Actually no, you're not necessarily stuck with the mdmon from boot. In
a pinch you could "mdmon /proc/mdstat /". Worst case you need to
re-dracut and reboot, but that is already more flexible than the
metadata handled in kernel-space approach.

> You can't run it from the rootfs, because that's a chicken/egg
> scenario (or you'll switch to it, and then be unable to mark it
> clean, because you can't mark the array r/o, because the filesystem
> is r/w, which you can't undo because the daemon is running on
> it...)

Array r/o is a separate issue from the raid metadata clean bit, see below.

> Long-lived daemons running from the initramfs aren't really good.

I agree, and I initially looked for ways to wait until the rootfs was
available before launching mdmon... then I hit the xfs journal
recovery case.

> We don't run udev that way.

At first glance it looks like plymouth is run this way, but I am
probably mistaken. From dracut/init:

[ -x /bin/plymouth ] && /bin/plymouth --newroot=$NEWROOT

One might say "just set the dirty bit, terminate, and wait for the
mdmon in the rootfs to take over". The problem is that a disk could
fail in this window, and this event needs to be handled before the
kernel does anything else to the array.

>
>> > - userspace frobs bit in superblock to say 'This array is CLEAN!'
>> ...not in this scenario no.
>
> Then when is the clean bit set?

The clean bit can be set as soon as the parity data is in sync with
the data on the other drives. We typically wait for some period of
write-inactivity to avoid needlessly touching the metadata after every
write.

>> > So, why isn't the ext* journal or filesystem unclean flag
>> > handled via a userspace file monitoring daemon, then?
>>
>> I'm not trying to be obtuse, but because it isn't. Put another way,
>> consider what extra tools the initramfs would need if we wanted to
>> support an ntfs-3g rootfs.
>
> You're asking for *the exact same thing*... just RAID specific.
>

The key difference being that there are performance reasons for
handling filesystem metadata in the kernel. Raid metadata events are
always in the slow path.

--
Dan
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
[not found] ` <e9c3a7c20902061119i2120cc5fpda0a5cdc3aedc17b-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-02-06 20:08 ` Bill Nottingham
[not found] ` <20090206200818.GC6150-Zdt1ptygihhQcNjhGXsBABcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
0 siblings, 1 reply; 24+ messages in thread
From: Bill Nottingham @ 2009-02-06 20:08 UTC (permalink / raw)
To: Dan Williams
Cc: Danecki, Jacek, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

Dan Williams (dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org) said:
> Actually no, you're not necessarily stuck with the mdmon from boot. In
> a pinch you could "mdmon /proc/mdstat /".

Not really.

You state:

> One might say "just set the dirty bit, terminate, and wait for the
> mdmon in the rootfs to take over". The problem is that a disk could
> fail in this window, and this event needs to be handled before the
> kernel does anything else to the array.
...
> The clean bit can be set as soon as the parity data is in sync with
> the data on the other drives. We typically wait for some period of
> write-inactivity to avoid needlessly touching the metadata after every
> write.

You shut down the machine. After a while, you get to the point where
you're getting ready to unmount the filesystem. Since mdmon's running
on it (if you started it post boot), you have to kill it. After that
point, there are going to be writes (a final sync, if nothing else,
when you unmount the filesystem.) And you won't be able to set any
RAID metadata flags then, as the daemon won't be running. So, doing
a later run of "mdmon /proc/mdstat" doesn't fully protect you.

Bill
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
[not found] ` <20090206200818.GC6150-Zdt1ptygihhQcNjhGXsBABcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
@ 2009-02-06 20:21 ` NeilBrown
[not found] ` <8c48d75b834c74adc39b6e904a44237e.squirrel-eq65iwfR9nKIECXXMXunQA@public.gmane.org>
2009-02-06 20:26 ` Dan Williams
1 sibling, 1 reply; 24+ messages in thread
From: NeilBrown @ 2009-02-06 20:21 UTC (permalink / raw)
To: Bill Nottingham
Cc: Dan Williams, Danecki, Jacek, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Sat, February 7, 2009 7:08 am, Bill Nottingham wrote:
> Dan Williams (dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org) said:
>> Actually no, you're not necessarily stuck with the mdmon from boot. In
>> a pinch you could "mdmon /proc/mdstat /".
>
> Not really.
>
> You state:
>
>> One might say "just set the dirty bit, terminate, and wait for the
>> mdmon in the rootfs to take over". The problem is that a disk could
>> fail in this window, and this event needs to be handled before the
>> kernel does anything else to the array.
> ...
>> The clean bit can be set as soon as the parity data is in sync with
>> the data on the other drives. We typically wait for some period of
>> write-inactivity to avoid needlessly touching the metadata after every
>> write.
>
> You shut down the machine. After a while, you get to the point where
> you're getting ready to unmount the filesystem. Since mdmon's running
> on it (if you started it post boot), you have to kill it. After that
> point, there are going to be writes (a final sync, if nothing else,
> when you unmount the filesystem.) And you won't be able to set any
> RAID metadata flags then, as the daemon won't be running. So, doing
> a later run of "mdmon /proc/mdstat" doesn't fully protect you.

???

Last time I checked, Linux would not unmount the root filesystem.
It just remounts it 'read-only'.
Is that going to change?

NeilBrown
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
[not found] ` <8c48d75b834c74adc39b6e904a44237e.squirrel-eq65iwfR9nKIECXXMXunQA@public.gmane.org>
@ 2009-02-06 20:27 ` Bill Nottingham
0 siblings, 0 replies; 24+ messages in thread
From: Bill Nottingham @ 2009-02-06 20:27 UTC (permalink / raw)
To: NeilBrown
Cc: Dan Williams, Danecki, Jacek, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

NeilBrown (neilb-l3A5Bk7waGM@public.gmane.org) said:
> > You shut down the machine. After a while, you get to the point where
> > you're getting ready to unmount the filesystem. Since mdmon's running
> > on it (if you started it post boot), you have to kill it. After that
> > point, there are going to be writes (a final sync, if nothing else,
> > when you unmount the filesystem.) And you won't be able to set any
> > RAID metadata flags then, as the daemon won't be running. So, doing
> > a later run of "mdmon /proc/mdstat" doesn't fully protect you.
>
> Last time I checked, Linux would not unmount the root filesystem.
> It just remounts it 'read-only'.
> Is that going to change?

Yeah, I screwed up that part. However, it still syncs, and the mdmon
process will still be dead.

Bill
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
[not found] ` <20090206200818.GC6150-Zdt1ptygihhQcNjhGXsBABcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
2009-02-06 20:21 ` NeilBrown
@ 2009-02-06 20:26 ` Dan Williams
[not found] ` <e9c3a7c20902061226m3f1e9e55pc2986a8527ade77-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
1 sibling, 1 reply; 24+ messages in thread
From: Dan Williams @ 2009-02-06 20:26 UTC (permalink / raw)
To: Bill Nottingham
Cc: Danecki, Jacek, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

On Fri, Feb 6, 2009 at 1:08 PM, Bill Nottingham <notting-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> Dan Williams (dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org) said:
>> Actually no, you're not necessarily stuck with the mdmon from boot. In
>> a pinch you could "mdmon /proc/mdstat /".
>
> Not really.
>
> You state:
>
>> One might say "just set the dirty bit, terminate, and wait for the
>> mdmon in the rootfs to take over". The problem is that a disk could
>> fail in this window, and this event needs to be handled before the
>> kernel does anything else to the array.
> ...
>> The clean bit can be set as soon as the parity data is in sync with
>> the data on the other drives. We typically wait for some period of
>> write-inactivity to avoid needlessly touching the metadata after every
>> write.
>
> You shut down the machine. After a while, you get to the point where
> you're getting ready to unmount the filesystem. Since mdmon's running
> on it (if you started it post boot), you have to kill it. After that
> point, there are going to be writes (a final sync, if nothing else,
> when you unmount the filesystem.) And you won't be able to set any
> RAID metadata flags then, as the daemon won't be running. So, doing
> a later run of "mdmon /proc/mdstat" doesn't fully protect you.
>

mdmon needs some coordination with the shutdown scripts to be kept
alive until the rootfs is marked readonly... actually up until the
point where the rootdev can be marked readonly.

If you take a look at Debian's killall implementation it has
provisions to exclude fuse and other critical userspace processes from
killall. A similar exclusion is needed for mdmon.
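The exclusion described above can be sketched as follows, assuming a sysvinit-style shutdown script that uses killall5 and its -o (omit PID) option. The helper name omit_args and the PROTECTED_PIDS variable are hypothetical; how a given distribution actually wires mdmon into its shutdown sequence is not covered by this thread.

```shell
#!/bin/sh
# Sketch only: turn a list of PIDs into "-o PID" arguments so that a
# shutdown script can spare selected daemons (for example mdmon) when it
# signals every other process.
omit_args() {
    args=""
    for pid in "$@"; do
        case $pid in
            ''|*[!0-9]*) continue ;;   # ignore anything that is not numeric
        esac
        args="$args -o $pid"
    done
    # Drop the leading space left by the first append.
    echo "${args# }"
}

# Dry-run usage (prints the command instead of executing it):
#   PROTECTED_PIDS=$(pidof mdmon)
#   echo killall5 -15 $(omit_args $PROTECTED_PIDS)
```

The spared daemon still has to be stopped eventually, after the root device has been marked read-only; the sketch covers only the exclusion step.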
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
[not found] ` <e9c3a7c20902061226m3f1e9e55pc2986a8527ade77-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-07-09 20:19 ` Warren Togami
0 siblings, 0 replies; 24+ messages in thread
From: Warren Togami @ 2009-07-09 20:19 UTC (permalink / raw)
To: Dan Williams
Cc: Bill Nottingham, Danecki, Jacek, Jeremy Katz, initramfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, neilb-l3A5Bk7waGM@public.gmane.org

On 02/06/2009 03:26 PM, Dan Williams wrote:
> On Fri, Feb 6, 2009 at 1:08 PM, Bill Nottingham<notting-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> Dan Williams (dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org) said:
>>> Actually no, you're not necessarily stuck with the mdmon from boot. In
>>> a pinch you could "mdmon /proc/mdstat /".
>> Not really.
>>
>> You state:
>>
>>> One might say "just set the dirty bit, terminate, and wait for the
>>> mdmon in the rootfs to take over". The problem is that a disk could
>>> fail in this window, and this event needs to be handled before the
>>> kernel does anything else to the array.
>> ...
>>> The clean bit can be set as soon as the parity data is in sync with
>>> the data on the other drives. We typically wait for some period of
>>> write-inactivity to avoid needlessly touching the metadata after every
>>> write.
>> You shut down the machine. After a while, you get to the point where
>> you're getting ready to unmount the filesystem. Since mdmon's running
>> on it (if you started it post boot), you have to kill it. After that
>> point, there are going to be writes (a final sync, if nothing else,
>> when you unmount the filesystem.) And you won't be able to set any
>> RAID metadata flags then, as the daemon won't be running. So, doing
>> a later run of "mdmon /proc/mdstat" doesn't fully protect you.
>>
>
> mdmon needs some coordination with the shutdown scripts to be kept
> alive until the rootfs is marked readonly... actually up until the
> point where the rootdev can be marked readonly.
>
> If you take a look at Debian's killall implementation it has
> provisions to exclude fuse and other critical userspace processes from
> killall. A similar exclusion is needed for mdmon.

It appears we have no solution for this yet in Fedora 12.

https://bugzilla.redhat.com/show_bug.cgi?id=496843
This bug has a similar request for network block devices that need a
userspace process.

Warren Togami
wtogami-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
2009-02-06 18:12 ` Dan Williams
[not found] ` <e9c3a7c20902061012w15a31e7br6ce2074b7b9db555-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-02-08 19:16 ` Szabolcs Szakacsits
1 sibling, 0 replies; 24+ messages in thread
From: Szabolcs Szakacsits @ 2009-02-08 19:16 UTC (permalink / raw)
To: initramfs-u79uwXL29TY76Z2rM5mHXA

Dan Williams <dan.j.williams@...> writes:

> consider what extra tools the initramfs would need if we wanted to
> support an ntfs-3g rootfs.

The FUSE kernel module. Nothing else. Several distros do it.

Regards,
Szaka

NTFS-3G: http://ntfs-3g.org
* Re: [RFC PATCH 2/3] raid: external and internal metadata support
[not found] ` <20090206164019.GD552-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-02-06 16:50 ` Danecki, Jacek
@ 2009-02-06 18:02 ` Dan Williams
1 sibling, 0 replies; 24+ messages in thread
From: Dan Williams @ 2009-02-06 18:02 UTC (permalink / raw)
To: Jeremy Katz
Cc: initramfs-u79uwXL29TY76Z2rM5mHXA, neilb-l3A5Bk7waGM, jacek.danecki-ral2JQCrhuEAvxtiuMwx3w, Kay Sievers

On Fri, Feb 6, 2009 at 9:40 AM, Jeremy Katz <katzj-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> RAID arrays should be getting started by udev rules, not by explicit
> calls to mdadm in /init. Yes, this means having proper integration with
> udev for your kernel pieces. But this ends up helping everything as it
> will also let us lose the multiple redundant calls to things like mdadm
> (and lvm, etc) throughout the boot process which should just be
> occurring as devices show up.

The trick is determining when a device has not shown up yet versus it
will never show up... to prevent the array being marked degraded
prematurely. Is there some mechanism for udev to broadcast "if you
were waiting for more devices to show up don't hold your breath"?
I.e. at the point where a call to "udevadm settle" would reasonably be
expected to not find any pending events?

I am thinking something along the lines of:

<udev: add disk>
mdadm --incremental --no-degraded $dev
<udev: add disk>
mdadm --incremental --no-degraded $dev
<udev: probably no more devices>
mdadm --incremental $last_dev

--
Dan
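The sequence proposed above can be written as a small dispatcher. This is a dry-run sketch that only prints the mdadm command for each event; the function name handle_event and the "settled" event name are hypothetical, and the thread leaves open whether udev can deliver such a "no more devices" notification at all.

```shell
#!/bin/sh
# Sketch only: map udev-style events onto the mdadm invocations proposed
# in the message above. Commands are printed, not executed.
handle_event() {
    event=$1
    dev=$2
    case $event in
        add)
            # A member disk appeared: add it, but do not start the
            # array while it would still be degraded.
            echo "mdadm --incremental --no-degraded $dev"
            ;;
        settled)
            # Hypothetical "probably no more devices" notification:
            # retry with the last device, now allowing a degraded start.
            echo "mdadm --incremental $dev"
            ;;
        *)
            echo "unknown event: $event" >&2
            return 1
            ;;
    esac
}

# Example walk-through with three events:
#   handle_event add /dev/sda
#   handle_event add /dev/sdb
#   handle_event settled /dev/sdb
```

The point of the split is that only the final call is allowed to start the array short of members, so a slow disk does not cause a premature degraded start.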
* [RFC PATCH 3/3] add more disk id helpers to udevexe
[not found] ` <20090205224808.18610.14957.stgit-p8uTFz9XbKjBPTuBivz2/GFmcEqAMTzPQQ4Iyu8u01E@public.gmane.org>
2009-02-05 22:49 ` [RFC PATCH 1/3] gen-mod-lists: create lists of modules that may talk to a root device Dan Williams
2009-02-05 22:49 ` [RFC PATCH 2/3] raid: external and internal metadata support Dan Williams
@ 2009-02-05 22:49 ` Dan Williams
2 siblings, 0 replies; 24+ messages in thread
From: Dan Williams @ 2009-02-05 22:49 UTC (permalink / raw)
To: initramfs-u79uwXL29TY76Z2rM5mHXA
Cc: neilb-l3A5Bk7waGM, jacek.danecki-ral2JQCrhuEAvxtiuMwx3w

Allow udev to create /dev/disk/by-id links

Signed-off-by: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 dracut | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/dracut b/dracut
index 513b6c6..da86a23 100755
--- a/dracut
+++ b/dracut
@@ -73,7 +73,7 @@ raidexe="/sbin/mdadm /sbin/mdmon"
 # and some things that are nice for debugging
 debugexe="/bin/ls /bin/cat /bin/ln /bin/ps /bin/grep /bin/more"
 # udev things we care about
-udevexe="/lib/udev/vol_id /lib/udev/console_init"
+udevexe="/lib/udev/*_id /lib/udev/console_init"
 
 # install base files
 for binary in $exe $debugexe $udevexe $lvmexe $cryptexe $raidexe ; do
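The one-line change works because $udevexe is expanded unquoted in the "for binary in ..." loop, so the shell globs /lib/udev/*_id at that point. A minimal sketch of that behaviour, using a hypothetical helper expand_helpers and a directory passed in as an argument instead of the real /lib/udev:

```shell
#!/bin/sh
# Sketch only: show how the glob stored in udevexe expands when the
# variable is used unquoted in a for loop, as in the dracut script.
expand_helpers() {
    dir=$1
    # The pattern is stored literally; quoting prevents expansion here.
    udevexe="$dir/*_id $dir/console_init"
    # Unquoted on purpose: word splitting and globbing happen here.
    for binary in $udevexe; do
        # If nothing matches, the literal pattern is left in place,
        # so skip names that do not exist.
        if [ -e "$binary" ]; then
            basename "$binary"
        fi
    done
}
```

Run against a directory containing ata_id, scsi_id, vol_id and console_init, the function lists all four names, while a file such as firmware.sh that does not end in _id is left out.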
Thread overview: 24+ messages
2009-02-05 22:49 [RFC PATCH 0/3] mdraid rootfs support Dan Williams
[not found] ` <20090205224808.18610.14957.stgit-p8uTFz9XbKjBPTuBivz2/GFmcEqAMTzPQQ4Iyu8u01E@public.gmane.org>
2009-02-05 22:49 ` [RFC PATCH 1/3] gen-mod-lists: create lists of modules that may talk to a root device Dan Williams
2009-02-05 22:49 ` [RFC PATCH 2/3] raid: external and internal metadata support Dan Williams
[not found] ` <20090205224920.18610.63979.stgit-p8uTFz9XbKjBPTuBivz2/GFmcEqAMTzPQQ4Iyu8u01E@public.gmane.org>
2009-02-06 16:40 ` Jeremy Katz
[not found] ` <20090206164019.GD552-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-02-06 16:50 ` Danecki, Jacek
[not found] ` <A9DE54D0CD747C4CB06DCE5B6FA2246F4B496AFA-IGOiFh9zz4yvNW/NfzhIbrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2009-02-06 16:55 ` Dan Williams
2009-02-06 16:56 ` Bill Nottingham
[not found] ` <20090206165601.GF11144-Zdt1ptygihhQcNjhGXsBABcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
2009-02-06 17:27 ` Dan Williams
[not found] ` <e9c3a7c20902060927j2b900940kd851573469110135-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-02-06 17:38 ` Bill Nottingham
[not found] ` <20090206173814.GA3541-Zdt1ptygihhQcNjhGXsBABcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
2009-02-06 18:00 ` Jacek Danecki
[not found] ` <498C7AD8.6080105-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2009-02-06 19:34 ` NeilBrown
[not found] ` <2c0cae741a7229789cd777d93180072a.squirrel-eq65iwfR9nKIECXXMXunQA@public.gmane.org>
2009-02-06 20:03 ` Bill Nottingham
2009-02-08 19:08 ` Szabolcs Szakacsits
2009-02-06 18:12 ` Dan Williams
[not found] ` <e9c3a7c20902061012w15a31e7br6ce2074b7b9db555-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-02-06 18:21 ` Bill Nottingham
[not found] ` <20090206182118.GA4413-Zdt1ptygihhQcNjhGXsBABcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
2009-02-06 19:19 ` Dan Williams
[not found] ` <e9c3a7c20902061119i2120cc5fpda0a5cdc3aedc17b-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-02-06 20:08 ` Bill Nottingham
[not found] ` <20090206200818.GC6150-Zdt1ptygihhQcNjhGXsBABcY2uh10dtjAL8bYrjMMd8@public.gmane.org>
2009-02-06 20:21 ` NeilBrown
[not found] ` <8c48d75b834c74adc39b6e904a44237e.squirrel-eq65iwfR9nKIECXXMXunQA@public.gmane.org>
2009-02-06 20:27 ` Bill Nottingham
2009-02-06 20:26 ` Dan Williams
[not found] ` <e9c3a7c20902061226m3f1e9e55pc2986a8527ade77-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-07-09 20:19 ` Warren Togami
2009-02-08 19:16 ` Szabolcs Szakacsits
2009-02-06 18:02 ` Dan Williams
2009-02-05 22:49 ` [RFC PATCH 3/3] add more disk id helpers to udevexe Dan Williams