* Partitioned raid and major number
@ 2004-02-25 14:56 Miquel van Smoorenburg
  2004-02-25 18:46 ` H. Peter Anvin
  2004-02-25 23:25 ` Neil Brown
  0 siblings, 2 replies; 23+ messages in thread
From: Miquel van Smoorenburg @ 2004-02-25 14:56 UTC (permalink / raw)
  To: linux-raid

Hello,

I see that Linus merged partitioned raid into bitkeeper.
The major number of partitioned raid devices is allocated dynamically.

I want to set up a server with 2 disks in RAID1 mode, partitioned.
To be able to boot from it, the RAID1 device needs to have a fixed
major number (I don't want to be forced to use an initrd). Is it
planned to ask LANANA for a fixed major number? If not, would a
patch to pass the major number on the kernel command line be accepted?

Thanks,

Mike.

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-02-25 14:56 Partitioned raid and major number Miquel van Smoorenburg
@ 2004-02-25 18:46 ` H. Peter Anvin
  2004-02-25 23:25 ` Neil Brown
  1 sibling, 0 replies; 23+ messages in thread
From: H. Peter Anvin @ 2004-02-25 18:46 UTC (permalink / raw)
  To: linux-raid

Followup to:  <20040225145624.GA1513@cistron.nl>
By author:    Miquel van Smoorenburg <miquels@cistron.nl>
In newsgroup: linux.dev.raid
>
> Hello,
>
> I see that Linus merged partitioned raid into bitkeeper.
> The major number of partitioned raid devices is allocated dynamically.
>
> I want to set up a server with 2 disks in RAID1 mode, partitioned.
> To be able to boot from it, the RAID1 device needs to have a fixed
> major number (I don't want to be forced to use an initrd). Is it
> planned to ask LANANA for a fixed major number? If not, would a
> patch to pass the major number on the kernel command line be accepted ?
>

Please ask <device@lanana.org> for a device number. Make sure to
specify if it's 2.6-specific (and hence will be assigned a number above
255) or not.

	-hpa

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-02-25 14:56 Partitioned raid and major number Miquel van Smoorenburg
  2004-02-25 18:46 ` H. Peter Anvin
@ 2004-02-25 23:25 ` Neil Brown
  2004-02-26 21:51   ` Miquel van Smoorenburg
  1 sibling, 1 reply; 23+ messages in thread
From: Neil Brown @ 2004-02-25 23:25 UTC (permalink / raw)
  To: Miquel van Smoorenburg; +Cc: linux-raid

On Wednesday February 25, miquels@cistron.nl wrote:
> Hello,
>
> I see that Linus merged partitioned raid into bitkeeper.
> The major number of partitioned raid devices is allocated dynamically.
>
> I want to set up a server with 2 disks in RAID1 mode, partitioned.
> To be able to boot from it, the RAID1 device needs to have a fixed
> major number (I don't want to be forced to use an initrd). Is it
> planned to ask LANANA for a fixed major number? If not, would a
> patch to pass the major number on the kernel command line be accepted ?

The lack of a statically allocated device number is not the problem.
You can have a kernel parameter that says
   root=/dev/md_d0p1
and it should manage to find the device thanks to sysfs.
The bit that you cannot do yet is assemble the array as a
partitionable array rather than a non-partitionable array.

Would you be willing to try the following patch?
With it:
  If you put
     raid=partitionable
  or just
     raid=part
  as a kernel parameter, then all auto-detected raid arrays will be
  partitionable, using the dynamically allocated major.

  Also, if you use e.g. "md=0,/dev/sda,/dev/sdb" to assemble your arrays
  at boot time, you can now use:
     "md=d0,/dev/sda,/dev/sdb"
  to assemble as a partitionable array (so it will be /dev/md/d0 instead
  of /dev/md0. Hence the 'd').

I use md= to assemble my root arrays, so I would now use:
   md=d0,/dev/sda,/dev/sdb root=/dev/md_d0p1
to use the first partition of the raid array on sda and sdb as my root
filesystem.

Please let me know if you try it and whether it works. I will be
testing it in a day or so.

NeilBrown ----------- Diffstat output ------------ ./drivers/md/md.c | 19 ++++++--- ./init/do_mounts_md.c | 99 +++++++++++++++++++++++++++++--------------------- 2 files changed, 70 insertions(+), 48 deletions(-) diff ./drivers/md/md.c~current~ ./drivers/md/md.c --- ./drivers/md/md.c~current~ 2004-02-24 11:49:11.000000000 +1100 +++ ./drivers/md/md.c 2004-02-26 10:17:05.000000000 +1100 @@ -60,7 +60,7 @@ #ifndef MODULE -static void autostart_arrays (void); +static void autostart_arrays (int part); #endif static mdk_personality_t *pers[MAX_PERSONALITY]; @@ -1795,7 +1795,7 @@ static void autorun_array(mddev_t *mddev * * If "unit" is allocated, then bump its reference count */ -static void autorun_devices(void) +static void autorun_devices(int part) { struct list_head candidates; struct list_head *tmp; @@ -1828,7 +1828,12 @@ static void autorun_devices(void) bdevname(rdev0->bdev, b), rdev0->preferred_minor); break; } - dev = MKDEV(MD_MAJOR, rdev0->preferred_minor); + if (part) + dev = MKDEV(mdp_major, + rdev0->preferred_minor << MdpMinorShift); + else + dev = MKDEV(MD_MAJOR, rdev0->preferred_minor); + md_probe(dev, NULL, NULL); mddev = mddev_find(dev); if (!mddev) { @@ -1925,7 +1930,7 @@ static int autostart_array(dev_t startde /* * possibly return codes */ - autorun_devices(); + autorun_devices(0); return 0; } @@ -2507,7 +2512,7 @@ static int md_ioctl(struct inode *inode, #ifndef MODULE case RAID_AUTORUN: err = 0; - autostart_arrays(); + autostart_arrays(arg); goto done; #endif default:; @@ -3685,7 +3690,7 @@ void md_autodetect_dev(dev_t dev) } -static void autostart_arrays(void) +static void autostart_arrays(int part) { char b[BDEVNAME_SIZE]; mdk_rdev_t *rdev; @@ -3710,7 +3715,7 @@ static void autostart_arrays(void) } dev_cnt = 0; - autorun_devices(); + autorun_devices(part); } #endif diff ./init/do_mounts_md.c~current~ ./init/do_mounts_md.c --- ./init/do_mounts_md.c~current~ 2004-02-26 09:02:33.000000000 +1100 +++ ./init/do_mounts_md.c 2004-02-26 10:15:07.000000000 
+1100 @@ -12,14 +12,17 @@ * The code for that is here. */ -static int __initdata raid_noautodetect; +static int __initdata raid_noautodetect, raid_autopart; static struct { - char device_set [MAX_MD_DEVS]; - int pers[MAX_MD_DEVS]; - int chunk[MAX_MD_DEVS]; - char *device_names[MAX_MD_DEVS]; -} md_setup_args __initdata; + int minor; + int partitioned; + int pers; + int chunk; + char *device_names; +} md_setup_args[MAX_MD_DEVS] __initdata; + +static int md_setup_ents __initdata; /* * Parse the command-line parameters given our kernel, but do not @@ -43,21 +46,37 @@ static struct { */ static int __init md_setup(char *str) { - int minor, level, factor, fault, pers; + int minor, level, factor, fault, pers, partitioned = 0; char *pername = ""; - char *str1 = str; + char *str1; + int ent; + if (*str == 'd') { + partitioned = 1; + str++; + } if (get_option(&str, &minor) != 2) { /* MD Number */ printk(KERN_WARNING "md: Too few arguments supplied to md=.\n"); return 0; } + str1 = str; if (minor >= MAX_MD_DEVS) { printk(KERN_WARNING "md: md=%d, Minor device number too high.\n", minor); return 0; - } else if (md_setup_args.device_names[minor]) { - printk(KERN_WARNING "md: md=%d, Specified more than once. " - "Replacing previous definition.\n", minor); } + for (ent=0 ; ent< md_setup_ents ; ent++) + if (md_setup_args[ent].minor == minor && + md_setup_args[ent].partitioned == partitioned) { + printk(KERN_WARNING "md: md=%s%d, Specified more than once. " + "Replacing previous definition.\n", partitioned?"d":"", minor); + break; + } + if (ent >= MAX_MD_DEVS) { + printk(KERN_WARNING "md: md=%s%d - too many md initialisations\n", partitioned?"d":"", minor); + return 0; + } + if (ent >= md_setup_ents) + md_setup_ents++; switch (get_option(&str, &level)) { /* RAID Personality */ case 2: /* could be 0 or -1.. 
*/ if (level == 0 || level == LEVEL_LINEAR) { @@ -66,24 +85,16 @@ static int __init md_setup(char *str) printk(KERN_WARNING "md: Too few arguments supplied to md=.\n"); return 0; } - md_setup_args.pers[minor] = level; - md_setup_args.chunk[minor] = 1 << (factor+12); - switch(level) { - case LEVEL_LINEAR: + md_setup_args[ent].pers = level; + md_setup_args[ent].chunk = 1 << (factor+12); + if (level == LEVEL_LINEAR) { pers = LINEAR; pername = "linear"; - break; - case 0: + } else { pers = RAID0; pername = "raid0"; - break; - default: - printk(KERN_WARNING - "md: The kernel has not been configured for raid%d support!\n", - level); - return 0; } - md_setup_args.pers[minor] = pers; + md_setup_args[ent].pers = pers; break; } /* FALL THROUGH */ @@ -91,35 +102,38 @@ static int __init md_setup(char *str) str = str1; /* FALL THROUGH */ case 0: - md_setup_args.pers[minor] = 0; + md_setup_args[ent].pers = 0; pername="super-block"; } printk(KERN_INFO "md: Will configure md%d (%s) from %s, below.\n", minor, pername, str); - md_setup_args.device_names[minor] = str; + md_setup_args[ent].device_names = str; + md_setup_args[ent].partitioned = partitioned; + md_setup_args[ent].minor = minor; return 1; } static void __init md_setup_drive(void) { - int minor, i; + int minor, i, ent, partitioned; dev_t dev; dev_t devices[MD_SB_DISKS+1]; - for (minor = 0; minor < MAX_MD_DEVS; minor++) { + for (ent = 0; ent < md_setup_ents ; ent++) { int fd; int err = 0; char *devname; mdu_disk_info_t dinfo; char name[16], devfs_name[16]; - if (!(devname = md_setup_args.device_names[minor])) - continue; - - sprintf(name, "/dev/md%d", minor); - sprintf(devfs_name, "/dev/md/%d", minor); + minor = md_setup_args[ent].minor; + partitioned = md_setup_args[ent].partitioned; + devname = md_setup_args[ent].device_names; + + sprintf(name, "/dev/md%s%d", partitioned?"_d":"", minor); + sprintf(devfs_name, "/dev/md/%s%d", partitioned?"d":"", minor); create_dev(name, MKDEV(MD_MAJOR, minor), devfs_name); for (i = 0; i < 
MD_SB_DISKS && devname != 0; i++) { char *p; @@ -143,20 +157,19 @@ static void __init md_setup_drive(void) } devices[i] = dev; - md_setup_args.device_set[minor] = 1; devname = p; } devices[i] = 0; - if (!md_setup_args.device_set[minor]) + if (!i) continue; - printk(KERN_INFO "md: Loading md%d: %s\n", minor, md_setup_args.device_names[minor]); + printk(KERN_INFO "md: Loading md%s%d: %s\n", partitioned?"_d":"", minor, md_setup_args[ent].device_names); fd = open(name, 0, 0); if (fd < 0) { - printk(KERN_ERR "md: open failed - cannot start array %d\n", minor); + printk(KERN_ERR "md: open failed - cannot start array %s\n", name); continue; } if (sys_ioctl(fd, SET_ARRAY_INFO, 0) == -EBUSY) { @@ -167,10 +180,10 @@ static void __init md_setup_drive(void) continue; } - if (md_setup_args.pers[minor]) { + if (md_setup_args[ent].pers) { /* non-persistent */ mdu_array_info_t ainfo; - ainfo.level = pers_to_level(md_setup_args.pers[minor]); + ainfo.level = pers_to_level(md_setup_args[ent].pers); ainfo.size = 0; ainfo.nr_disks =0; ainfo.raid_disks =0; @@ -181,7 +194,7 @@ static void __init md_setup_drive(void) ainfo.state = (1 << MD_SB_CLEAN); ainfo.layout = 0; - ainfo.chunk_size = md_setup_args.chunk[minor]; + ainfo.chunk_size = md_setup_args[ent].chunk; err = sys_ioctl(fd, SET_ARRAY_INFO, (long)&ainfo); for (i = 0; !err && i <= MD_SB_DISKS; i++) { dev = devices[i]; @@ -229,6 +242,10 @@ static int __init raid_setup(char *str) if (!strncmp(str, "noautodetect", wlen)) raid_noautodetect = 1; + if (strncmp(str, "partitionable", wlen)==0) + raid_autopart = 1; + if (strncmp(str, "part", wlen)==0) + raid_autopart = 1; pos += wlen+1; } return 1; @@ -245,7 +262,7 @@ void __init md_run_setup(void) else { int fd = open("/dev/md0", 0, 0); if (fd >= 0) { - sys_ioctl(fd, RAID_AUTORUN, 0); + sys_ioctl(fd, RAID_AUTORUN, raid_autopart); close(fd); } } ^ permalink raw reply [flat|nested] 23+ messages in thread
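
Editorial illustration: the core of the `md=` change above is that a leading 'd' on the array number marks the array as partitionable before the number itself is parsed. The following is a minimal user-space sketch of that idea, not the kernel code. The names `struct md_arg` and `parse_md_arg` are hypothetical, it uses `strtol` where the kernel uses `get_option`, and it only handles the superblock form `md=[d]N,dev0,dev1,...` (no level or chunk fields).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical result type for this sketch; not a kernel structure. */
struct md_arg {
	int partitioned;      /* 1 if the array number was prefixed with 'd' */
	int minor;            /* array number, e.g. 0 for md0 or md_d0 */
	const char *devices;  /* remainder of the string after the first comma */
};

/*
 * Parse "[d]N,dev0,dev1,...".
 * Returns 0 on success, -1 if the number or the comma is missing.
 */
static int parse_md_arg(const char *str, struct md_arg *out)
{
	char *end;
	long n;

	assert(str != NULL && out != NULL);

	out->partitioned = 0;
	if (*str == 'd') {            /* same test as the patch: if (*str == 'd') */
		out->partitioned = 1;
		str++;
	}

	n = strtol(str, &end, 10);
	if (end == str || *end != ',' || n < 0)
		return -1;            /* no digits, no device list, or negative */

	out->minor = (int)n;
	out->devices = end + 1;       /* points at "dev0,dev1,..." */
	return 0;
}
```

With this sketch, `"d0,/dev/sda,/dev/sdb"` yields `partitioned = 1, minor = 0`, while `"0,/dev/sda,/dev/sdb"` yields `partitioned = 0, minor = 0`. That is why the patch has to look up existing entries by the (minor, partitioned) pair rather than by minor alone.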
* Re: Partitioned raid and major number
  2004-02-25 23:25 ` Neil Brown
@ 2004-02-26 21:51   ` Miquel van Smoorenburg
  2004-02-27  0:21     ` Neil Brown
  0 siblings, 1 reply; 23+ messages in thread
From: Miquel van Smoorenburg @ 2004-02-26 21:51 UTC (permalink / raw)
  To: Neil Brown; +Cc: Miquel van Smoorenburg, linux-raid

On Thu, 26 Feb 2004 00:25:46, Neil Brown wrote:
> On Wednesday February 25, miquels@cistron.nl wrote:
> > Hello,
> >
> > I see that Linus merged partitioned raid into bitkeeper.
> > The major number of partitioned raid devices is allocated dynamically.
> >
> > I want to set up a server with 2 disks in RAID1 mode, partitioned.
> > To be able to boot from it, the RAID1 device needs to have a fixed
> > major number (I don't want to be forced to use an initrd). Is it
> > planned to ask LANANA for a fixed major number? If not, would a
> > patch to pass the major number on the kernel command line be accepted ?
>
> The lack of a statically allocated device number is not the problem.
> You can have a kernel parameter that says
>    root=/dev/md_d0p1
> and it should manage to find the device thanks to sysfs.
> The bit that you cannot do yet is assemble the array as a
> partitionable array rather than a non-partitionable array.

Hmm, is there anyone who has > 128 MD devices on one system? If not, why not
use the top 128 majors for, say, 8 MD devices each with 16 partitions ?
Then the kernel command line option "md=0,/dev/sda,/dev/sdb,part" would create
a partitionable md0 array. And if you don't add "part" things stay as they
were with 256 MD majors.

That's just a suggestion. Feel free to completely ignore it (as you
probably will since dynamically allocated majors, initrd and udev
are the future some people say ..)

> Would you be willing to try the following patch?

Sure.

> With it:
>   If you put
>      raid=partitionable
>   or just
>      raid=part
>   as a kernel parameter, then all auto-detected raid arrays will be
>   partitionable, using the dynamically allocated major.
>
>   Also, if you use e.g. "md=0,/dev/sda,/dev/sdb" to assemble your arrays
>   at boot time, you can now use:
>      "md=d0,/dev/sda,/dev/sdb"
>   to assemble as a partitionable array (so it will be /dev/md/d0 instead
>   of /dev/md0. Hence the 'd').

I did exactly that on a server with /dev/sda and /dev/sdb (those are 2
SATA disks through libata). Since /dev/sda contains the currently installed
system I marked it as failed in /etc/raidtab before creating the array.

But it doesn't work.

# cat /proc/cmdline
auto BOOT_IMAGE=Linux ro root=801 raid=partitionable md=d0,/dev/sda,/dev/sdb
# ls /sys/block
md0 sda sdb
# ls /sys/block/md0
dev range size stat
# dmesg | grep md:
md: Will configure md0 (super-block) from /dev/sda,/dev/sdb, below.
md: raid1 personality registered as nr 3
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
md: Loading md_d0: /dev/sda
md: invalid raid superblock magic on sda
md: sda has invalid sb, not importing!
md: md_import_device returned -22
md: bind<sdb>

Is it because the first disk is invalid ? That shouldn't happen,
right?

Thanks,

Mike.

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-02-26 21:51   ` Miquel van Smoorenburg
@ 2004-02-27  0:21     ` Neil Brown
  2004-02-27  1:17       ` Neil Brown
  0 siblings, 1 reply; 23+ messages in thread
From: Neil Brown @ 2004-02-27 0:21 UTC (permalink / raw)
  To: Miquel van Smoorenburg; +Cc: linux-raid

On Thursday February 26, miquels@cistron.nl wrote:
>
> Hmm, is there anyone who has > 128 MD devices on one system? If not, why not
> use the top 128 majors for, say, 8 MD devices each with 16 partitions ?
> Then the kernel command line option "md=0,/dev/sda,/dev/sdb,part" would create
> a partitionable md0 array. And if you don't add "part" things stay as they
> were with 256 MD majors.

Maybe there is someone with > 128 MD arrays. There is no way to find
out except to break it and wait about a year or two.

I did consider having some partitionable and some non-partitionable
arrays under the one MAJOR. It would be technically quite easy, but I
think it is conceptually harder to work with - the mapping from minor
number to device would not be what people have come to expect.

> > With it:
> >   If you put
> >      raid=partitionable
> >   or just
> >      raid=part
> >   as a kernel parameter, then all auto-detected raid arrays will be
> >   partitionable, using the dynamically allocated major.
> >
> >   Also, if you use e.g. "md=0,/dev/sda,/dev/sdb" to assemble your arrays
> >   at boot time, you can now use:
> >      "md=d0,/dev/sda,/dev/sdb"
> >   to assemble as a partitionable array (so it will be /dev/md/d0 instead
> >   of /dev/md0. Hence the 'd').
>
> I did exactly that on a server with /dev/sda and /dev/sdb (those are 2
> SATA disks through libata). Since /dev/sda contains the currently installed
> system I marked it as failed in /etc/raidtab before creating the array.
>
> But it doesn't work.
>
> # cat /proc/cmdline
> auto BOOT_IMAGE=Linux ro root=801 raid=partitionable md=d0,/dev/sda,/dev/sdb
> # ls /sys/block
> md0 sda sdb
> # ls /sys/block/md0
> dev range size stat
> # dmesg | grep md:
> md: Will configure md0 (super-block) from /dev/sda,/dev/sdb, below.
> md: raid1 personality registered as nr 3
> md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
> md: Autodetecting RAID arrays.
> md: autorun ...
> md: ... autorun DONE.
> md: Loading md_d0: /dev/sda
> md: invalid raid superblock magic on sda
> md: sda has invalid sb, not importing!
> md: md_import_device returned -22
> md: bind<sdb>
>
> Is it because the first disk is invalid ? That shouldn't happen,
> right?

Right. I missed a bit in the patch.
(I assume you are still wanting to boot off /dev/sda until you copy
the data into /dev/md/d0p* - then you will use root=/dev/md_d0p1)

NeilBrown

 ----------- Diffstat output ------------
 ./init/do_mounts_md.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletion(-)

diff ./init/do_mounts_md.c~current~ ./init/do_mounts_md.c
--- ./init/do_mounts_md.c~current~	2004-02-26 10:15:07.000000000 +1100
+++ ./init/do_mounts_md.c	2004-02-27 11:20:14.000000000 +1100
@@ -134,7 +134,8 @@ static void __init md_setup_drive(void)
 
 		sprintf(name, "/dev/md%s%d", partitioned?"_d":"", minor);
 		sprintf(devfs_name, "/dev/md/%s%d", partitioned?"d":"", minor);
-		create_dev(name, MKDEV(MD_MAJOR, minor), devfs_name);
+		dev = name_to_dev_t(name);
+		create_dev(name, dev, devfs_name);
 		for (i = 0; i < MD_SB_DISKS && devname != 0; i++) {
 			char *p;
 			char comp_name[64];

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-02-27  0:21     ` Neil Brown
@ 2004-02-27  1:17       ` Neil Brown
  2004-02-27 16:56         ` Miquel van Smoorenburg
  2004-03-09 16:46         ` Booting from partitioned raid, do_mounts_md.c patch " Miquel van Smoorenburg
  0 siblings, 2 replies; 23+ messages in thread
From: Neil Brown @ 2004-02-27 1:17 UTC (permalink / raw)
  To: Miquel van Smoorenburg, linux-raid

On Friday February 27, neilb@cse.unsw.edu.au wrote:
>
> Right. I missed a bit in the patch.
> (I assume you are still wanting to boot off /dev/sda until you copy
> the data into /dev/md/d0p* - then you will use root=/dev/md_d0p1)

Sorry, that patch was wrong.
This one, on top of the original patch, works for me (I finally got
around to testing it).

NeilBrown

 ----------- Diffstat output ------------
 ./drivers/md/md.c     |    2 +-
 ./init/do_mounts_md.c |    9 ++++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2004-02-26 10:17:05.000000000 +1100
+++ ./drivers/md/md.c	2004-02-27 12:06:31.000000000 +1100
@@ -1450,7 +1450,7 @@ abort:
 	return 1;
 }
 
-static int mdp_major = 0;
+int mdp_major = 0;
 
 static struct kobject *md_probe(dev_t dev, int *part, void *data)
 {

diff ./init/do_mounts_md.c~current~ ./init/do_mounts_md.c
--- ./init/do_mounts_md.c~current~	2004-02-26 10:15:07.000000000 +1100
+++ ./init/do_mounts_md.c	2004-02-27 12:09:29.000000000 +1100
@@ -24,6 +24,7 @@ static struct {
 
 static int md_setup_ents __initdata;
 
+extern int mdp_major;
 /*
  * Parse the command-line parameters given our kernel, but do not
  * actually try to invoke the MD device now; that is handled by
@@ -115,6 +116,8 @@ static int __init md_setup(char *str)
 	return 1;
 }
 
+#define MdpMinorShift 6
+
 static void __init md_setup_drive(void)
 {
 	int minor, i, ent, partitioned;
@@ -134,7 +137,11 @@ static void __init md_setup_drive(void)
 
 		sprintf(name, "/dev/md%s%d", partitioned?"_d":"", minor);
 		sprintf(devfs_name, "/dev/md/%s%d", partitioned?"d":"", minor);
-		create_dev(name, MKDEV(MD_MAJOR, minor), devfs_name);
+		if (partitioned)
+			dev = MKDEV(mdp_major, minor << MdpMinorShift);
+		else
+			dev = MKDEV(MD_MAJOR, minor);
+		create_dev(name, dev, devfs_name);
 		for (i = 0; i < MD_SB_DISKS && devname != 0; i++) {
 			char *p;
 			char comp_name[64];

^ permalink raw reply	[flat|nested] 23+ messages in thread
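
Editorial illustration: the patch above gives partitionable array N the device number `MKDEV(mdp_major, N << MdpMinorShift)`, so with a shift of 6 each array owns a block of 64 minors. The sketch below works through that arithmetic in user space. The helper names are hypothetical, and the rule that partition p of an array sits at the array's base minor plus p is an assumption taken from the usual block-layer convention, not something stated in this thread.

```c
#include <assert.h>

/* Same value as MdpMinorShift in the patch above. */
#define MDP_MINOR_SHIFT 6

/* Minor number of partition `part` (0 = whole array) of array `array`. */
static unsigned int mdp_minor(unsigned int array, unsigned int part)
{
	/* Assumption: partitions occupy the low 6 bits of the minor. */
	assert(part < (1u << MDP_MINOR_SHIFT));
	return (array << MDP_MINOR_SHIFT) | part;
}

/* Inverse mapping: which array does a given minor belong to? */
static unsigned int mdp_array_of(unsigned int minor_nr)
{
	return minor_nr >> MDP_MINOR_SHIFT;
}

/* Inverse mapping: which partition within that array? */
static unsigned int mdp_part_of(unsigned int minor_nr)
{
	return minor_nr & ((1u << MDP_MINOR_SHIFT) - 1);
}
```

Under these assumptions /dev/md_d0 is minor 0, /dev/md_d0p1 is minor 1, and /dev/md_d1 is minor 64, all under the dynamically allocated `mdp_major`.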
* Re: Partitioned raid and major number
  2004-02-27  1:17       ` Neil Brown
@ 2004-02-27 16:56         ` Miquel van Smoorenburg
  2004-02-28  1:09           ` Miquel van Smoorenburg
  2004-03-01  0:16           ` Partitioned raid and major number Neil Brown
  2004-03-09 16:46         ` Booting from partitioned raid, do_mounts_md.c patch " Miquel van Smoorenburg
  1 sibling, 2 replies; 23+ messages in thread
From: Miquel van Smoorenburg @ 2004-02-27 16:56 UTC (permalink / raw)
  To: Neil Brown; +Cc: Miquel van Smoorenburg, linux-raid

On 2004.02.27 02:17, Neil Brown wrote:
> On Friday February 27, neilb@cse.unsw.edu.au wrote:
> >
> > Right. I missed a bit in the patch.
> > (I assume you are still wanting to boot off /dev/sda until you copy
> > the data into /dev/md/d0p* - then you will use root=/dev/md_d0p1)
>
> Sorry, that patch was wrong.
> This one, on top of the original patch, works for me (I finally got
> around to testing it).

Yes, it works! Great.

Now how to enable RAID1 on an existing disk... I hoped that I could
create an array with /dev/sda and /dev/sdb, with /dev/sdb marked
as failed-disk, because initializing a RAID1 array with just
one working disk doesn't destroy the existing contents of
the disk, right? (I kept the last few MB of the disk as free
space for the raid superblock).

Unfortunately the current tools (or the kernel) don't let me
do that (/dev/sda is busy).

Two more minor issues - one, if partitioned MD is on (/dev/md/d0 etc)
the standard /dev/md0 device doesn't work anymore. For accessing
the whole device (management purposes / tools) shouldn't both
/dev/md0 and /dev/md/d0 open the same device?

Two, shouldn't raid=partitionable md=d0,/dev/sda,/dev/sdb
simply be md=d0,/dev/sda,/dev/sdb,partitionable? You could
even leave out the 'd' then and make it
md=0,/dev/sda,/dev/sdb,partitionable. Together with (one) this
would make a bit more sense.

I hope to figure out how to migrate an existing 1-disk setup to
RAID1 on a live machine over the weekend.

Mike.

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-02-27 16:56         ` Miquel van Smoorenburg
@ 2004-02-28  1:09           ` Miquel van Smoorenburg
  2004-02-28  7:27             ` Luca Berra
  2004-03-01  0:54             ` Neil Brown
  2004-03-01  0:16           ` Partitioned raid and major number Neil Brown
  1 sibling, 2 replies; 23+ messages in thread
From: Miquel van Smoorenburg @ 2004-02-28 1:09 UTC (permalink / raw)
  To: Neil Brown; +Cc: Miquel van Smoorenburg, linux-raid

On Fri, 27 Feb 2004 17:56:14, Miquel van Smoorenburg wrote:
> On 2004.02.27 02:17, Neil Brown wrote:
> > On Friday February 27, neilb@cse.unsw.edu.au wrote:
> > >
> > > Right. I missed a bit in the patch.
> > > (I assume you are still wanting to boot off /dev/sda until you copy
> > > the data into /dev/md/d0p* - then you will use root=/dev/md_d0p1)
> >
> > Sorry, that patch was wrong.
> > This one, on top of the original patch, works for me (I finally got
> > around to testing it).
>
> Yes, it works! Great.
>
> Now how to enable RAID1 on an existing disk...

Hmm. With a dynamic major, the system might fail at checking the root
file system at boot. At that time, /dev is still read-only, and
/dev/md/d0p1 might not be the correct device yet.

So either mdp needs its own partition number, or we need a /dev/root
device that's an alias for the current root (like /dev/console).

Fortunately, that's very easy. Which makes me wonder why this hasn't
been done before .. what am I overlooking ?

Patch below uses 4,1 which is just arbitrary, of course. Comments ?

--- linux-2.6.3/fs/block_dev.c	2004-02-18 04:59:58.000000000 +0100
+++ linux-2.6.3-bk8-mdp/fs/block_dev.c	2004-02-28 01:58:27.000000000 +0100
@@ -339,6 +339,11 @@ struct block_device *bdget(dev_t dev)
 	struct block_device *bdev;
 	struct inode *inode;
 
+#if 1 /* XXX miquels */
+	if (dev == MKDEV(4, 1))
+		dev = current->fs->pwdmnt->mnt_sb->s_dev;
+#endif
+
 	inode = iget5_locked(bd_mnt->mnt_sb, hash(dev),
 			bdev_test, bdev_set, &dev);
 

Mike.

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-02-28  1:09           ` Miquel van Smoorenburg
@ 2004-02-28  7:27             ` Luca Berra
  2004-03-01  0:54             ` Neil Brown
  1 sibling, 0 replies; 23+ messages in thread
From: Luca Berra @ 2004-02-28 7:27 UTC (permalink / raw)
  To: linux-raid

On Sat, Feb 28, 2004 at 02:09:16AM +0100, Miquel van Smoorenburg wrote:
>Hmm. With a dynamic major, the system might fail at checking the root
>file system at boot. At that time, /dev is still read-only, and
>/dev/md/d0p1 might not be the correct device yet.
>
>So either mdp needs its own partition number, or we need a /dev/root
>device that's an alias for the current root (like /dev/console).
>
>Fortunately, that's very easy. Which makes me wonder why this hasn't
>been done before .. what am I overlooking ?
>
>Patch below uses 4,1 which is just arbitrary, of course. Comments ?
>
I was missing this feature from linux, and I don't know why it was not
done before...
Having /dev/root also solves the similar problem of ppl whose root
is on a device-mapper like myself. (dm has dynamic majors)

Regards,
L.

-- 
Luca Berra -- bluca@comedia.it
        Communication Media & Services S.r.l.
 /"\
 \ /     ASCII RIBBON CAMPAIGN
  X        AGAINST HTML MAIL
 / \

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-02-28  1:09           ` Miquel van Smoorenburg
  2004-02-28  7:27             ` Luca Berra
@ 2004-03-01  0:54             ` Neil Brown
  2004-03-01  1:04               ` Miquel van Smoorenburg
                                 ` (2 more replies)
  1 sibling, 3 replies; 23+ messages in thread
From: Neil Brown @ 2004-03-01 0:54 UTC (permalink / raw)
  To: Miquel van Smoorenburg; +Cc: linux-raid

On Saturday February 28, miquels@cistron.nl wrote:
>
> Hmm. With a dynamic major, the system might fail at checking the root
> file system at boot. At that time, /dev is still read-only, and
> /dev/md/d0p1 might not be the correct device yet.
>
> So either mdp needs its own partition number, or we need a /dev/root
> device that's an alias for the current root (like /dev/console).
>

Yes, I think this is a real problem.
There are a number of avenues that could be followed to fix it.
One is your suggestion.
Another is to make "rootfs" remountable like this:

--- ./fs/ramfs/inode.c~current~	2004-03-01 11:20:58.000000000 +1100
+++ ./fs/ramfs/inode.c	2004-03-01 11:21:15.000000000 +1100
@@ -207,7 +207,7 @@ static struct super_block *ramfs_get_sb(
 static struct super_block *rootfs_get_sb(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *data)
 {
-	return get_sb_nodev(fs_type, flags|MS_NOUSER, data, ramfs_fill_super);
+	return get_sb_single(fs_type, flags, data, ramfs_fill_super);
 }
 
 static struct file_system_type ramfs_fs_type = {

And then:
   mount -t rootfs rootfs /mnt/root
   fsck /mnt/root/dev/root

Another is to add "rootdev" to /proc/*, as in appended patch. Then
   ln -s /proc/self/rootdev /dev/root

and providing /proc is mounted, /dev/root will work.

I think I prefer the /proc/self/rootdev approach despite it being the
bigger patch.

I might try to push it on linux-kernel.

NeilBrown


diff ./fs/proc/base.c~current~ ./fs/proc/base.c
--- ./fs/proc/base.c~current~	2004-03-01 11:28:24.000000000 +1100
+++ ./fs/proc/base.c	2004-03-01 11:48:07.000000000 +1100
@@ -50,6 +50,7 @@ enum pid_directory_inos {
 	PROC_TGID_MEM,
 	PROC_TGID_CWD,
 	PROC_TGID_ROOT,
+	PROC_TGID_ROOTDEV,
 	PROC_TGID_EXE,
 	PROC_TGID_FD,
 	PROC_TGID_ENVIRON,
@@ -73,6 +74,7 @@ enum pid_directory_inos {
 	PROC_TID_MEM,
 	PROC_TID_CWD,
 	PROC_TID_ROOT,
+	PROC_TID_ROOTDEV,
 	PROC_TID_EXE,
 	PROC_TID_FD,
 	PROC_TID_ENVIRON,
@@ -115,6 +117,7 @@ static struct pid_entry tgid_base_stuff[
 	E(PROC_TGID_MEM, "mem", S_IFREG|S_IRUSR|S_IWUSR),
 	E(PROC_TGID_CWD, "cwd", S_IFLNK|S_IRWXUGO),
 	E(PROC_TGID_ROOT, "root", S_IFLNK|S_IRWXUGO),
+	E(PROC_TGID_ROOTDEV, "rootdev", S_IFBLK|S_IRUSR|S_IWUSR),
 	E(PROC_TGID_EXE, "exe", S_IFLNK|S_IRWXUGO),
 	E(PROC_TGID_MOUNTS, "mounts", S_IFREG|S_IRUGO),
 #ifdef CONFIG_SECURITY
@@ -137,6 +140,7 @@ static struct pid_entry tid_base_stuff[]
 	E(PROC_TID_MEM, "mem", S_IFREG|S_IRUSR|S_IWUSR),
 	E(PROC_TID_CWD, "cwd", S_IFLNK|S_IRWXUGO),
 	E(PROC_TID_ROOT, "root", S_IFLNK|S_IRWXUGO),
+	E(PROC_TID_ROOTDEV, "rootdev", S_IFBLK|S_IRUSR|S_IWUSR),
 	E(PROC_TID_EXE, "exe", S_IFLNK|S_IRWXUGO),
 	E(PROC_TID_MOUNTS, "mounts", S_IFREG|S_IRUGO),
 #ifdef CONFIG_SECURITY
@@ -771,6 +775,32 @@ static struct inode_operations proc_pid_
 	.follow_link	= proc_pid_follow_link
 };
 
+int proc_pid_get_attr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat)
+{
+	struct inode *inode = dentry->d_inode;
+	struct fs_struct *fs;
+	int result = -ENOENT;
+	generic_fillattr(inode, stat);
+	task_lock(proc_task(inode));
+	fs = proc_task(inode)->fs;
+	if(fs)
+		atomic_inc(&fs->count);
+	task_unlock(proc_task(inode));
+	if (fs) {
+		read_lock(&fs->lock);
+		stat->rdev = fs->pwdmnt->mnt_sb->s_dev;
+		read_unlock(&fs->lock);
+		result = 0;
+		put_fs_struct(fs);
+	}
+
+	return result;
+}
+
+static struct inode_operations proc_pid_dev_inode_operations = {
+	.getattr	= proc_pid_get_attr,
+};
+
 static int pid_alive(struct task_struct *p)
 {
 	BUG_ON(p->pids[PIDTYPE_PID].pidptr != &p->pids[PIDTYPE_PID].pid);
@@ -1319,6 +1349,10 @@ static struct dentry *proc_pident_lookup
 			inode->i_op = &proc_pid_link_inode_operations;
 			ei->op.proc_get_link = proc_root_link;
 			break;
+		case PROC_TID_ROOTDEV:
+		case PROC_TGID_ROOTDEV:
+			inode->i_op = &proc_pid_dev_inode_operations;
+			break;
 		case PROC_TID_ENVIRON:
 		case PROC_TGID_ENVIRON:
 			inode->i_fop = &proc_info_file_operations;

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-03-01  0:54             ` Neil Brown
@ 2004-03-01  1:04               ` Miquel van Smoorenburg
  2004-03-02  0:36                 ` H. Peter Anvin
  2004-03-01 15:38               ` Miquel van Smoorenburg
  2004-03-09 15:34               ` /dev/root (was: Re: Partitioned raid and major number) Miquel van Smoorenburg
  2 siblings, 1 reply; 23+ messages in thread
From: Miquel van Smoorenburg @ 2004-03-01 1:04 UTC (permalink / raw)
  To: Neil Brown; +Cc: Miquel van Smoorenburg, linux-raid

On Mon, 01 Mar 2004 01:54:29, Neil Brown wrote:
> On Saturday February 28, miquels@cistron.nl wrote:
> >
> > Hmm. With a dynamic major, the system might fail at checking the root
> > file system at boot. At that time, /dev is still read-only, and
> > /dev/md/d0p1 might not be the correct device yet.
> >
> > So either mdp needs its own partition number, or we need a /dev/root
> > device that's an alias for the current root (like /dev/console).
> >
>
> Yes, I think this is a real problem.
> There are a number of avenues that could be followed to fix it.
> One is your suggestion.

That was a quick hack and as I said, I think it's flawed :)

> Another is to make "rootfs" remountable like this:

POSIX allows different semantics for / and //, I mentioned before that
perhaps we should make "cd //" chdir to rootfs instead of /. Then
you can also have //proc and //sys without explicitly mounting them.
But I don't think the time has come for that yet (besides it needs
more thought wrt namespaces, chroot etc).

> Another is to add "rootdev" to /proc/*, as in appended patch. Then
>    ln -s /proc/self/rootdev /dev/root
>
> and providing /proc is mounted, /dev/root will work.

I like that approach.

Mike.

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-03-01  1:04               ` Miquel van Smoorenburg
@ 2004-03-02  0:36                 ` H. Peter Anvin
  0 siblings, 0 replies; 23+ messages in thread
From: H. Peter Anvin @ 2004-03-02 0:36 UTC (permalink / raw)
  To: linux-raid

Followup to:  <20040301010419.GK14194@drinkel.cistron.nl>
By author:    Miquel van Smoorenburg <miquels@cistron.nl>
In newsgroup: linux.dev.raid
>
> POSIX allows different semantics for / and //, I mentioned before that
> perhaps we should make "cd //" chdir to rootfs instead of /. Then
> you can also have //proc and //sys without explicitly mounting them.
> But I don't think the time has come for that yet (besides it needs
> more thought wrt namespaces, chroot etc).
>

It's allowed, but definitely not recommended.

I do like the idea of making rootfs remountable, though.

> > Another is to add "rootdev" to /proc/*, as in appended patch. Then
> >    ln -s /proc/self/rootdev /dev/root
> >
> > and providing /proc is mounted, /dev/root will work.
>
> I like that approach.

.. assuming the rootfs isn't a nodev filesystem.

	-hpa

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-03-01  0:54 ` Neil Brown
  2004-03-01  1:04   ` Miquel van Smoorenburg
@ 2004-03-01 15:38   ` Miquel van Smoorenburg
  2004-03-09 15:34   ` /dev/root (was: Re: Partitioned raid and major number) Miquel van Smoorenburg
  2 siblings, 0 replies; 23+ messages in thread
From: Miquel van Smoorenburg @ 2004-03-01 15:38 UTC (permalink / raw)
  To: Neil Brown; +Cc: Miquel van Smoorenburg, linux-raid

On 2004.03.01 01:54, Neil Brown wrote:
> Another is to add "rootdev" to /proc/*, as in appended patch. Then
>    ln -s /proc/self/rootdev /dev/root
>
> and providing /proc is mounted, /dev/root will work.
>
> I think I prefer the /proc/self/rootdev approach despite it being the
> bigger patch.
>
> I might try to push it on linux-kernel.

It doesn't work. Here's a version that does:

--- linux-2.6.3/fs/proc/base.c	2004-02-18 04:58:32.000000000 +0100
+++ linux-2.6.3-bk8-mdp/fs/proc/base.c	2004-03-01 15:20:22.000000000 +0100
@@ -50,6 +50,7 @@
 	PROC_TGID_MEM,
 	PROC_TGID_CWD,
 	PROC_TGID_ROOT,
+	PROC_TGID_ROOTDEV,
 	PROC_TGID_EXE,
 	PROC_TGID_FD,
 	PROC_TGID_ENVIRON,
@@ -73,6 +74,7 @@
 	PROC_TID_MEM,
 	PROC_TID_CWD,
 	PROC_TID_ROOT,
+	PROC_TID_ROOTDEV,
 	PROC_TID_EXE,
 	PROC_TID_FD,
 	PROC_TID_ENVIRON,
@@ -115,6 +117,7 @@
 	E(PROC_TGID_MEM, "mem", S_IFREG|S_IRUSR|S_IWUSR),
 	E(PROC_TGID_CWD, "cwd", S_IFLNK|S_IRWXUGO),
 	E(PROC_TGID_ROOT, "root", S_IFLNK|S_IRWXUGO),
+	E(PROC_TGID_ROOTDEV, "rootdev", S_IFBLK|S_IRUSR|S_IWUSR),
 	E(PROC_TGID_EXE, "exe", S_IFLNK|S_IRWXUGO),
 	E(PROC_TGID_MOUNTS, "mounts", S_IFREG|S_IRUGO),
 #ifdef CONFIG_SECURITY
@@ -137,6 +140,7 @@
 	E(PROC_TID_MEM, "mem", S_IFREG|S_IRUSR|S_IWUSR),
 	E(PROC_TID_CWD, "cwd", S_IFLNK|S_IRWXUGO),
 	E(PROC_TID_ROOT, "root", S_IFLNK|S_IRWXUGO),
+	E(PROC_TID_ROOTDEV, "rootdev", S_IFBLK|S_IRUSR|S_IWUSR),
 	E(PROC_TID_EXE, "exe", S_IFLNK|S_IRWXUGO),
 	E(PROC_TID_MOUNTS, "mounts", S_IFREG|S_IRUGO),
 #ifdef CONFIG_SECURITY
@@ -771,6 +775,32 @@
 	.follow_link	= proc_pid_follow_link
 };
 
+static int init_pid_rootdev_inode(struct inode *inode)
+{
+	struct fs_struct *fs;
+	struct vfsmount *vmnt;
+	int result = -ENOENT;
+	dev_t rootdev = 0;
+
+	task_lock(proc_task(inode));
+	fs = proc_task(inode)->fs;
+	if(fs)
+		atomic_inc(&fs->count);
+	task_unlock(proc_task(inode));
+	if (fs) {
+		read_lock(&fs->lock);
+		vmnt = mntget(fs->rootmnt);
+		rootdev = vmnt->mnt_sb->s_dev;
+		mntput(vmnt);
+		read_unlock(&fs->lock);
+		result = 0;
+		put_fs_struct(fs);
+	}
+	init_special_inode(inode, inode->i_mode, rootdev);
+
+	return result;
+}
+
 static int pid_alive(struct task_struct *p)
 {
 	BUG_ON(p->pids[PIDTYPE_PID].pidptr != &p->pids[PIDTYPE_PID].pid);
@@ -958,7 +988,9 @@
 	ei->type = ino;
 	inode->i_uid = 0;
 	inode->i_gid = 0;
-	if (ino == PROC_TGID_INO || ino == PROC_TID_INO || task_dumpable(task)) {
+	if (ino != PROC_TGID_ROOTDEV && ino != PROC_TID_ROOTDEV &&
+	    (ino == PROC_TGID_INO || ino == PROC_TID_INO ||
+	     task_dumpable(task))) {
 		inode->i_uid = task->euid;
 		inode->i_gid = task->egid;
 	}
@@ -988,7 +1020,10 @@
 	struct inode *inode = dentry->d_inode;
 	struct task_struct *task = proc_task(inode);
 	if (pid_alive(task)) {
-		if (proc_type(inode) == PROC_TGID_INO || proc_type(inode) == PROC_TID_INO || task_dumpable(task)) {
+		int ino = proc_type(inode);
+		if (ino != PROC_TGID_ROOTDEV && ino != PROC_TID_ROOTDEV &&
+		    (ino == PROC_TGID_INO || ino == PROC_TID_INO ||
+		     task_dumpable(task))) {
 			inode->i_uid = task->euid;
 			inode->i_gid = task->egid;
 		} else {
@@ -1319,6 +1354,10 @@
 			inode->i_op = &proc_pid_link_inode_operations;
 			ei->op.proc_get_link = proc_root_link;
 			break;
+		case PROC_TID_ROOTDEV:
+		case PROC_TGID_ROOTDEV:
+			init_pid_rootdev_inode(inode);
+			break;
 		case PROC_TID_ENVIRON:
 		case PROC_TGID_ENVIRON:
 			inode->i_fop = &proc_info_file_operations;
^ permalink raw reply [flat|nested] 23+ messages in thread
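For comparison with the rootdev patch in this message, here is a minimal
user-space sketch, not part of the thread. It assumes Linux with glibc
(for <sys/sysmacros.h>), and path_devnum() is a hypothetical helper name.
stat() on "/" reports in st_dev the device number of the filesystem at the
caller's root, which for a root on a block device should be the same value
the patch takes from mnt_sb->s_dev. It only reads the number; it does not
give fsck a device node to open, so it does not replace the patch.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Hypothetical helper: look up the device number of the filesystem
 * that holds `path`.  Returns 0 on success and fills in *maj / *min,
 * or -1 if stat() fails (errno is left as set by stat()).
 */
static inline int path_devnum(const char *path,
			      unsigned int *maj, unsigned int *min)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;

	/* st_dev is the device containing the file, not st_rdev. */
	*maj = major(st.st_dev);
	*min = minor(st.st_dev);
	return 0;
}
```

Calling path_devnum("/", &maj, &min) from an init script helper would give
the major and minor needed for a mknod of /dev/root, provided /dev is
writable at that point, which is exactly the difficulty raised earlier in
the thread.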
* /dev/root (was: Re: Partitioned raid and major number)
  2004-03-01  0:54 ` Neil Brown
  2004-03-01  1:04   ` Miquel van Smoorenburg
  2004-03-01 15:38   ` Miquel van Smoorenburg
@ 2004-03-09 15:34   ` Miquel van Smoorenburg
  2004-03-10  2:05     ` Neil Brown
  2 siblings, 1 reply; 23+ messages in thread
From: Miquel van Smoorenburg @ 2004-03-09 15:34 UTC (permalink / raw)
  To: Neil Brown; +Cc: Miquel van Smoorenburg, linux-raid

On 2004.03.01 01:54, Neil Brown wrote:
> On Saturday February 28, miquels@cistron.nl wrote:
> >
> > Hmm. With a dynamic major, the system might fail at checking the root
> > file system at boot. At that time, /dev is still read-only, and
> > /dev/md/d0p1 might not be the correct device yet.
> >
> > So either mdp needs its own partition number, or we need a /dev/root
> > device that's an alias for the current root (like /dev/console).
> >
>
> Yes, I think this is a real problem.
> There are a number of avenues that could be followed to fix it.
> One is your suggestion.
>
> Another is to make "rootfs" remountable
>
> And then:
>
>    mount -t rootfs rootfs /mnt/root
>    fsck /mnt/root/dev/root
>
> Another is to add "rootdev" to /proc/*, as in appended patch. Then
>    ln -s /proc/self/rootdev /dev/root
>
> and providing /proc is mounted, /dev/root will work.
>
> I might try to push it on linux-kernel.

Mind if I post all three approaches (/dev/root alias device, rootfs,
/proc/pid/root) to linux-kernel and ask input on what approach is the
preferred one ?

Mike.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: /dev/root (was: Re: Partitioned raid and major number)
  2004-03-09 15:34 ` /dev/root (was: Re: Partitioned raid and major number) Miquel van Smoorenburg
@ 2004-03-10  2:05   ` Neil Brown
  0 siblings, 0 replies; 23+ messages in thread
From: Neil Brown @ 2004-03-10  2:05 UTC (permalink / raw)
  To: Miquel van Smoorenburg; +Cc: linux-raid

On Tuesday March 9, miquels@cistron.nl wrote:
>
> Mind if I post all three approaches (/dev/root alias device, rootfs, /proc/pid/root)
> to linux-kernel and ask input on what approach is the preferred one ?

Go right ahead.

Thanks,
NeilBrown
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-02-27 16:56 ` Miquel van Smoorenburg
  2004-02-28  1:09   ` Miquel van Smoorenburg
@ 2004-03-01  0:16   ` Neil Brown
  2004-03-01  0:42     ` Miquel van Smoorenburg
  1 sibling, 1 reply; 23+ messages in thread
From: Neil Brown @ 2004-03-01  0:16 UTC (permalink / raw)
  To: Miquel van Smoorenburg; +Cc: linux-raid

On Friday February 27, miquels@cistron.nl wrote:
> On 2004.02.27 02:17, Neil Brown wrote:
> > On Friday February 27, neilb@cse.unsw.edu.au wrote:
> > >
> > > Right. I missed a bit in the patch.
> > > (I assume you are still wanting to boot off /dev/sda until you copy
> > > the data into /dev/md/d0p* - then you will use root=/dev/md_d0p1)
> >
> > Sorry, that patch was wrong.
> > This one, on top of the original patch, works for me (I finally got
> > around to testing it).
>
> Yes, it works! Great.
>
> Now how to enable RAID1 on an existing disk... I hoped that I could create
> an array with /dev/sda and /dev/sdb, with /dev/sdb marked as failed-disk.
> Because initializing a RAID1 array with just one working disk doesn't
> destroy the existing contents of the disk, right ? (I kept the last few
> MB of the disk as free space for the raid superblock).
>
> Unfortunately the current tools (or the kernel) don't let me do that
> (/dev/sda is busy).

Yep. Currently it isn't really possible while sda is mounted.
You need to boot off some other media and create the array there.

You are right: creating a raid1 with just one working disk doesn't
destroy existing contents - except for the last 128K or so.

>
> Two more minor issues - one, if partitioned MD is on (/dev/md/d0 etc)
> the standard /dev/md0 device doesn't work anymore. For accessing the whole
> device (management purposes / tools) shouldn't both /dev/md0 and
> /dev/md/d0 open the same device ?

No - they are completely different devices. Making them appear as one
device has all sorts of problems relating to confusing bits of code
that think they have exclusive access. The idea was appealing but
wasn't worth the effort.

>
> Two, shouldn't raid=partitionable md=d0,/dev/sda,/dev/sdb simply be
> md=d0,/dev/sda,/dev/sdb,partitionable ? You could even leave out the 'd'
> then and make it md=0,/dev/sda,/dev/sdb,partitionable. Together with
> (one) this would make a bit more sense.

No.
  raid=partitionable
and
  md=d0,.....
are related, but mean substantially different things. You normally
only need one of these, not both.

  raid=partitionable
means that any auto-detected (using partition type 0xfd) arrays are
assembled as partitionable. This does not apply to you at all because
you are not making arrays out of partitions.

  md=d0,......
assembles an array from explicit devices and as a partitionable array.

NeilBrown
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-03-01  0:16 ` Partitioned raid and major number Neil Brown
@ 2004-03-01  0:42   ` Miquel van Smoorenburg
  2004-03-01  1:09     ` Neil Brown
  0 siblings, 1 reply; 23+ messages in thread
From: Miquel van Smoorenburg @ 2004-03-01  0:42 UTC (permalink / raw)
  To: Neil Brown; +Cc: Miquel van Smoorenburg, linux-raid

On Mon, 01 Mar 2004 01:16:03, Neil Brown wrote:
> On Friday February 27, miquels@cistron.nl wrote:
> > Now how to enable RAID1 on an existing disk... I hoped that I could create
> > an array with /dev/sda and /dev/sdb, with /dev/sdb marked as failed-disk.
> > Because initializing a RAID1 array with just one working disk doesn't
> > destroy the existing contents of the disk, right ? (I kept the last few
> > MB of the disk as free space for the raid superblock).
> >
> > Unfortunately the current tools (or the kernel) don't let me do that
> > (/dev/sda is busy).
>
> Yep. Currently it isn't really possible while sda is mounted.
> You need to boot off some other media and create the array there.

I hacked on it this weekend. I added a SET_ARRAY_INFO_CONFONLY ioctl
that creates an mddev, but marks it "confonly" internally. Meaning
you can configure it and add disks to it, but it can't be started/run.
The confonly array doesn't lock the disks when you add_new_disk().

I also patched raidtools2 to accept --confonly to mkraid, which uses
this new functionality. And that allows me to create a new raid
array on a disk that is currently in-use (you still have to use
--really-force, of course).

What do you think of that approach ? Converting from a 1 disk setup
to a 2-disk RAID1 setup on an existing system is something that lots
of people want to do, seeing that the software raid howto even has
a few paragraphs dedicated to it.

Though it works, and I can boot from it, lilo doesn't understand it yet
so I'll have to hack on that next.

But if it works it would probably eventually be possible to add ICH5-R
etc raid1 superblock support to it. Or just write a valid ICH5-R
raid1 superblock to the disk (hopefully at another offset) so that the
BIOS knows this is a RAID1 setup and can boot when sda/hda is dead.

> > Two more minor issues - one, if partitioned MD is on (/dev/md/d0 etc)
> > the standard /dev/md0 device doesn't work anymore. For accessing the whole
> > device (management purposes / tools) shouldn't both /dev/md0 and
> > /dev/md/d0 open the same device ?
>
> No - they are completely different devices. Making them appear as one
> device has all sorts of problems relating to confusing bits of code
> that think they have exclusive access. The idea was appealing but
> wasn't worth the effort.

Hmm yes, I understand. Thanks for explaining all that.

BTW, if you want to boot from a partitionable raid, you need the /dev/root
patch I posted before or you can't check the root filesystem. That patch
I think will not be accepted since stat(/dev/root) and
fd=open(/dev/root);fstat(fd) will return different things which is
inconsistent. How do you feel about applying for a static device number
for partitioned raid ? Hpa is also on this list, I noticed, and from his
reaction I think it wouldn't be a problem. Also it would be easier for
bootloaders like LILO to detect and deal with this.

Besides, if you check the current devices.txt you'll see that although
we've almost run out of majors that's only true for character devices.
There's plenty, plenty of block majors left.

Mike.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Partitioned raid and major number
  2004-03-01  0:42 ` Miquel van Smoorenburg
@ 2004-03-01  1:09   ` Neil Brown
  2004-03-09 15:32     ` Creating partitionable raid on existing disk (was: Re: Partitioned raid and major number) Miquel van Smoorenburg
  0 siblings, 1 reply; 23+ messages in thread
From: Neil Brown @ 2004-03-01  1:09 UTC (permalink / raw)
  To: Miquel van Smoorenburg; +Cc: linux-raid

On Monday March 1, miquels@cistron.nl wrote:
>
> What do you think of that approach ? Converting from a 1 disk setup
> to a 2-disk RAID1 setup on an existing system is something that lots
> of people want to do, seeing that the software raid howto even has
> a few paragraphs dedicated to it.

What I am thinking of doing is allowing:

   md=0,1,/dev/sda

to assemble a raid1 array without a superblock which uses just
/dev/sda.
Then

   md=d0,1,/dev/sda root=/dev/md_d0p1

would boot off md/d0p1 instead of sda1, but it would be the same data.

Then you would be able to add mirrors to this with something like:

   mdadm --grow /dev/md/d0 --disks=2
   mdadm /dev/md/d0 --add /dev/sdb

and you could convert it into an array with a persistent superblock
using:

   mdadm --grow /dev/md/d0 --persistent=yes

The only difficult bit is that setting a persistent superblock means
reducing the size of the device, and I would like it to be hard to do
that in error, but not impossible to do it.

>
> Though it works, and I can boot from it, lilo doesn't understand it yet
> so I'll have to hack on that next.

I have lilo working well with partitioned raid in 2.4.
I have a stanza in /etc/lilo.conf like:

   boot=/dev/Mda
   disk=/dev/Mda
      bios=0x80
      sectors=63
      heads=255
      cylinders=1024
      partition=/dev/md/d0p1
         start=1

where /dev/Mda is a symlink to /dev/md/d0, because lilo thinks it
understands device names that start "/dev/md".

The "start=" number is fairly important - lilo cannot or does not
figure this out itself, so you have to tell it. It is the start of
/dev/md/d0p1 in /dev/md/d0.

>
> But if it works it would probably eventually be possible to add ICH5-R
> etc raid1 superblock support to it. Or just write a valid ICH5-R
> raid1 superblock to the disk (hopefully at another offset) so that the
> BIOS knows this is a RAID1 setup and can boot when sda/hda is dead.

Not knowing anything about ICH5-R superblocks, I cannot comment, but I
would like to be able to support multiple superblock formats.

>
> BTW, if you want to boot from a partitionable raid, you need the /dev/root
> patch I posted before or you can't check the root filesystem. That patch
> I think will not be accepted since stat(/dev/root) and
> fd=open(/dev/root);fstat(fd) will return different things which is
> inconsistent. How do you feel about applying for a static device number
> for partitioned raid ? Hpa is also on this list, I noticed, and from his
> reaction I think it wouldn't be a problem. Also it would be easier for
> bootloaders like LILO to detect and deal with this.
>
> Besides, if you check the current devices.txt you'll see that although
> we've almost run out of majors that's only true for character devices.
> There's plenty, plenty of block majors left.

I realise that we could get a major allocated, but I would rather
not. As there seems to be a push for dynamic device numbers, I would
like to ride with it and find out all the implications.

NeilBrown
^ permalink raw reply [flat|nested] 23+ messages in thread
* Creating partitionable raid on existing disk (was: Re: Partitioned raid and major number)
  2004-03-01  1:09 ` Neil Brown
@ 2004-03-09 15:32   ` Miquel van Smoorenburg
  2004-03-10  2:41     ` Neil Brown
  0 siblings, 1 reply; 23+ messages in thread
From: Miquel van Smoorenburg @ 2004-03-09 15:32 UTC (permalink / raw)
  To: Neil Brown; +Cc: Miquel van Smoorenburg, linux-raid

On 2004.03.01 02:09, Neil Brown wrote:
> On Monday March 1, miquels@cistron.nl wrote:
> >
> > What do you think of that approach ? Converting from a 1 disk setup
> > to a 2-disk RAID1 setup on an existing system is something that lots
> > of people want to do, seeing that the software raid howto even has
> > a few paragraphs dedicated to it.
>
> What I am thinking of doing is allowing:
>
>    md=0,1,/dev/sda
>
> to assemble a raid1 array without a superblock which uses just
> /dev/sda.
> Then
>
>    md=d0,1,/dev/sda root=/dev/md_d0p1
>
> would boot off md/d0p1 instead of sda1, but it would be the same data.
>
> Then you would be able to add mirrors to this with something like:
>
>    mdadm --grow /dev/md/d0 --disks=2
>    mdadm /dev/md/d0 --add /dev/sdb
>
> and you could convert it into an array with a persistent superblock
> using:
>    mdadm --grow /dev/md/d0 --persistent=yes
>
> The only difficult bit is that setting a persistent superblock means
> reducing the size of the device, and I would like it to be hard to do
> that in error, but not impossible to do it.

Any progress on this?

The right thing to do would probably be to check if the device actually has
partitions - refuse to reduce the size of the device if it doesn't have
any partitions, or if any partition overlaps with the MD superblock.
That should be easy enough I think.

But it would take changes in both userlevel tools (add --grow to mdadm) and
the kernel, right ? So in the short run, I'll be better off by booting a
Knoppix CD and running mdadm from there, I suppose.

Are you interested in the patch I made to be able to initialize (but not run!)
an array on a disk that is otherwise busy (i.e. can't be bd_claim()ed) ?
Basically it just adds a SET_ARRAY_INFO_CONFONLY ioctl. I have patches for
both the kernel and mkraid, and they're pretty simple.

Mike.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Creating partitionable raid on existing disk (was: Re: Partitioned raid and major number)
  2004-03-09 15:32 ` Creating partitionable raid on existing disk (was: Re: Partitioned raid and major number) Miquel van Smoorenburg
@ 2004-03-10  2:41   ` Neil Brown
  0 siblings, 0 replies; 23+ messages in thread
From: Neil Brown @ 2004-03-10  2:41 UTC (permalink / raw)
  To: Miquel van Smoorenburg; +Cc: linux-raid

On Tuesday March 9, miquels@cistron.nl wrote:
> On 2004.03.01 02:09, Neil Brown wrote:
> > On Monday March 1, miquels@cistron.nl wrote:
> > >
> > > What do you think of that approach ? Converting from a 1 disk setup
> > > to a 2-disk RAID1 setup on an existing system is something that lots
> > > of people want to do, seeing that the software raid howto even has
> > > a few paragraphs dedicated to it.
> >
> > What I am thinking of doing is allowing:
> >
> >    md=0,1,/dev/sda
> >
> > to assemble a raid1 array without a superblock which uses just
> > /dev/sda.
...
>
> Any progress on this?

No, not yet.

>
> The right thing to do would probably be to check if the device actually has
> partitions - refuse to reduce the size of the device if it doesn't have
> any partitions, or if any partition overlaps with the MD superblock.
> That should be easy enough I think.

Certainly do-able. But I'm not sure it is completely correct.
Suppose the device isn't partitioned, and has a single filesystem on
it, which is smaller than the whole. Why not shrink it then?
I really want some general interface to find out how much of a drive
is in-use. Maybe if "bd_claim" took a parameter which said how much
was being claimed..... Or maybe claimants should reduce i_size of the
block device.

>
> But it would take changes in both userlevel tools (add --grow to mdadm) and
> the kernel, right ? So in the short run, I'll be better off by booting a
> Knoppix CD and running mdadm from there, I suppose.

Yes, there is a bit of work to be done before you can use this
approach.

>
> Are you interested in the patch I made to be able to initialize (but not run!)
> an array on a disk that is otherwise busy (i.e. can't be bd_claim()ed) ?
> Basically it just adds a SET_ARRAY_INFO_CONFONLY ioctl. I have patches for
> both the kernel and mkraid, and they're pretty simple.

Not really. If it just writes out a superblock, then it can be done
entirely in user-space - no kernel patch should be needed.

NeilBrown
^ permalink raw reply [flat|nested] 23+ messages in thread
* Booting from partitioned raid, do_mounts_md.c patch (was: Re: Partitioned raid and major number)
  2004-02-27  1:17 ` Neil Brown
  2004-02-27 16:56   ` Miquel van Smoorenburg
@ 2004-03-09 16:46   ` Miquel van Smoorenburg
  2004-03-10  2:36     ` Neil Brown
  1 sibling, 1 reply; 23+ messages in thread
From: Miquel van Smoorenburg @ 2004-03-09 16:46 UTC (permalink / raw)
  To: Neil Brown; +Cc: Miquel van Smoorenburg, linux-raid

On 2004.02.27 02:17, Neil Brown wrote:
> On Friday February 27, neilb@cse.unsw.edu.au wrote:
> >
> > Right. I missed a bit in the patch.
> > (I assume you are still wanting to boot off /dev/sda until you copy
> > the data into /dev/md/d0p* - then you will use root=/dev/md_d0p1)
>
> Sorry, that patch was wrong.
> This one, on top of the original patch, works for me (I finally got
> around to testing it).
>
>
>  ----------- Diffstat output ------------
>  ./drivers/md/md.c     |    2 +-
>  ./init/do_mounts_md.c |    9 ++++++++-
>  2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff ./drivers/md/md.c~current~ ./drivers/md/md.c
> diff ./init/do_mounts_md.c~current~ ./init/do_mounts_md.c

I've tested this a lot as well, and it works fine. On my supermicro 1U test
machine I can now pull out one of the two disks, and the machine still boots.
I can even take out disk2, insert disk1 in the slot for disk2 and it
still boots. Which is pretty cool ;)

I didn't see this in -rc2-mm1 yet - is this going to be submitted to -mm soon ?

Mike.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Booting from partitioned raid, do_mounts_md.c patch (was: Re: Partitioned raid and major number)
  2004-03-09 16:46 ` Booting from partitioned raid, do_mounts_md.c patch " Miquel van Smoorenburg
@ 2004-03-10  2:36   ` Neil Brown
  0 siblings, 0 replies; 23+ messages in thread
From: Neil Brown @ 2004-03-10  2:36 UTC (permalink / raw)
  To: Andrew Morton, Miquel van Smoorenburg; +Cc: linux-raid

On Tuesday March 9, miquels@cistron.nl wrote:
> >
> > Sorry, that patch was wrong.
> > This one, on top of the original patch, works for me (I finally got
> > around to testing it).
> >
> >
> >  ----------- Diffstat output ------------
> >  ./drivers/md/md.c     |    2 +-
> >  ./init/do_mounts_md.c |    9 ++++++++-
> >  2 files changed, 9 insertions(+), 2 deletions(-)
> >
> > diff ./drivers/md/md.c~current~ ./drivers/md/md.c
> > diff ./init/do_mounts_md.c~current~ ./init/do_mounts_md.c
>
> I've tested this a lot as well, and it works fine. On my supermicro 1U test
> machine I can now pull out one of the two disks, and the machine still boots.
> I can even take out disk2, insert disk1 in the slot for disk2 and it
> still boots. Which is pretty cool ;)
>
> I didn't see this in -rc2-mm1 yet - is this going to be submitted to -mm soon ?
>

Thanks for reminding me.

Andrew: please include this patch which completes the change started
by "md-array-assembly-fix".

Thanks,
NeilBrown

----------------------------------------------------
Make sure correct major is used when assembling partitioned md arrays from boot parameters.

We need to make mdp_major available to do_mounts_md.c, and use it there.

 ----------- Diffstat output ------------
 ./drivers/md/md.c     |    2 +-
 ./init/do_mounts_md.c |    9 ++++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2004-02-28 11:19:17.000000000 +1100
+++ ./drivers/md/md.c	2004-02-28 11:19:17.000000000 +1100
@@ -1450,7 +1450,7 @@ abort:
 	return 1;
 }
 
-static int mdp_major = 0;
+int mdp_major = 0;
 
 static struct kobject *md_probe(dev_t dev, int *part, void *data)
 {

diff ./init/do_mounts_md.c~current~ ./init/do_mounts_md.c
--- ./init/do_mounts_md.c~current~	2004-02-28 11:19:17.000000000 +1100
+++ ./init/do_mounts_md.c	2004-02-28 11:19:17.000000000 +1100
@@ -24,6 +24,7 @@ static struct {
 
 static int md_setup_ents __initdata;
 
+extern int mdp_major;
 /*
  * Parse the command-line parameters given our kernel, but do not
  * actually try to invoke the MD device now; that is handled by
@@ -115,6 +116,8 @@ static int __init md_setup(char *str)
 	return 1;
 }
 
+#define MdpMinorShift 6
+
 static void __init md_setup_drive(void)
 {
 	int minor, i, ent, partitioned;
@@ -134,7 +137,11 @@ static void __init md_setup_drive(void)
 
 		sprintf(name, "/dev/md%s%d", partitioned?"_d":"", minor);
 		sprintf(devfs_name, "/dev/md/%s%d", partitioned?"d":"", minor);
-		create_dev(name, MKDEV(MD_MAJOR, minor), devfs_name);
+		if (partitioned)
+			dev = MKDEV(mdp_major, minor << MdpMinorShift);
+		else
+			dev = MKDEV(MD_MAJOR, minor);
+		create_dev(name, dev, devfs_name);
 		for (i = 0; i < MD_SB_DISKS && devname != 0; i++) {
 			char *p;
 			char comp_name[64];
^ permalink raw reply [flat|nested] 23+ messages in thread
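The minor-number arithmetic in the patch above can be sketched in user
space as follows. This is only an illustration, not code from the thread:
the helper names (mdp_minor, mdp_array_of, mdp_part_of) are hypothetical,
and it assumes that partition N of an array sits at the array's first
minor plus N, which is the usual block-layer convention. Only the shift
value 6 is taken from the MdpMinorShift define in the patch.

```c
#include <assert.h>

/* Same value as MdpMinorShift in the patch: each partitionable array
 * owns 2^6 = 64 consecutive minors under the dynamic mdp major. */
#define MDP_MINOR_SHIFT		6
#define MDP_MINORS_PER_ARRAY	(1u << MDP_MINOR_SHIFT)

/* Minor of partition `part` on array md_d<array>; part 0 is the whole
 * array (the value the patch passes to MKDEV for create_dev). */
static inline unsigned int mdp_minor(unsigned int array, unsigned int part)
{
	assert(part < MDP_MINORS_PER_ARRAY);
	return (array << MDP_MINOR_SHIFT) + part;
}

/* Inverse mapping: which array a given minor belongs to. */
static inline unsigned int mdp_array_of(unsigned int minor)
{
	return minor >> MDP_MINOR_SHIFT;
}

/* Inverse mapping: which partition of that array (0 = whole array). */
static inline unsigned int mdp_part_of(unsigned int minor)
{
	return minor & (MDP_MINORS_PER_ARRAY - 1);
}
```

Under these assumptions md_d0 is minor 0, md_d0p1 is minor 1 and md_d1
is minor 64, so "md=d0,..." and "md=d1,..." never collide as long as an
array has fewer than 64 partitions.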
* Re: Partitioned raid and major number
@ 2004-02-26 23:01 Michael
0 siblings, 0 replies; 23+ messages in thread
From: Michael @ 2004-02-26 23:01 UTC (permalink / raw)
To: linux-raid
<snip>
> That's just a suggestion. Feel free to completely ignore it (as you
> probably will since dynamically allocated majors, initrd and udev
> are the future some people say ..)
>
<snip>
Well.... :-(
Back at raid v 0.5x, initrd was the only way you could start and stop
root mounted raid and it was a huge pain in the butt. Modifying the
initrd file every time you must make a change to some little thing is
no fun at all. I for one am very fond of partition type "FD".
Michael
Michael@Insulin-Pumpers.org
^ permalink raw reply [flat|nested] 23+ messages in thread