linux-raid.vger.kernel.org archive mirror
* ATARAID userspace configuration tool
@ 2004-02-10 14:18 Thomas Horsten
  2004-02-10 14:51 ` Matt Domsch
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Thomas Horsten @ 2004-02-10 14:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid

Hi,

I'm writing a userspace utility to detect/configure Medley (and later
other) ataraid devices in 2.6.

It's intended to run from initramfs (or initrd for those who use that).

I have a couple of questions and requests for clarification.

- Is there a "recommended" way to enumerate all block devices (not
partitions) from userside? Since this is ATA RAID, I could of course just
read the ideX majors from /proc/devices and try all the minors, but I
would prefer to get a list of all detected block devices in a portable
way.

- After I have used the DM (and possibly MD for some RAID types) to map
the ataraid devices, is there a way to remove the partitions of the
underlying disks from the kernel? This was my main reason for wanting to
do kernel-level autodetection of these arrays, so I could prevent add_disk
from being called and analysing the partition table. On these BIOS RAIDs,
in striped mode the first disk contains the partition table for the entire
array in sector 0. If the user (or a script) tries to mount the
partitions (or even read the extended partition table), it may try to read
past the end of the disk and will in any case use wrong sector numbers,
leading to possible disk corruption.

On top of this it would be useful to make the underlying devices
inaccessible after the mapped device is created (to prevent people from
doing things like fdisk /dev/hda, when what they really wanted was
something like fdisk /dev/ataraid/disc).

Detecting the partition table in userspace would fix this, but it's not
planned before 2.7 and I don't think it is safe to leave the false
partitions exposed.

- Some RAID types will need (I think) to use the MD framework as well as
DM (e.g. RAID0+1), so the device the users would use is the md device, which
would be composed of two dm devices. Is there a way to hide the underlying
dm devices from the users so they only see the ones they should use (or
prevent these from being used directly some other way)?

// Thomas



* Re: ATARAID userspace configuration tool
  2004-02-10 14:18 ATARAID userspace configuration tool Thomas Horsten
@ 2004-02-10 14:51 ` Matt Domsch
  2004-02-10 14:58 ` Christophe Saout
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Matt Domsch @ 2004-02-10 14:51 UTC (permalink / raw)
  To: Thomas Horsten; +Cc: linux-kernel, linux-raid

On Tue, Feb 10, 2004 at 02:18:15PM +0000, Thomas Horsten wrote:
> - After I have used the DM (and possibly MD for some RAID types) to map
> the ataraid devices, is there a way to remove the partitions from the
> underlying disks from the kernel?
> Detecting the partition table in userspace would fix this, but it's not
> planned before 2.7 and I don't think it is safe to leave the false
> partitions exposed.

partx, part of util-linux, can do this in userspace today.
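
For reference, a minimal sketch of how a script might use it. This assumes a partx that supports -d (delete the kernel's partition entries for a whole disk, leaving the on-disk table alone). The function name and the DRY_RUN variable are made up for illustration, and /dev/hda is only an example device.

```shell
# Drop the kernel's partition entries for a whole disk without touching
# the partition table stored on the disk itself.
# DRY_RUN is a hypothetical knob for this sketch: when set, the command
# is printed instead of executed.
drop_partitions() {
	disk=$1
	if [ -n "$DRY_RUN" ]; then
		echo "partx -d $disk"
	else
		partx -d "$disk"
	fi
}
```

Calling `DRY_RUN=1 drop_partitions /dev/hda` only prints the command, which is a safe way to check what a boot script would do before letting it run as root.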

Thanks,
Matt

-- 
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com


* Re: ATARAID userspace configuration tool
  2004-02-10 14:18 ATARAID userspace configuration tool Thomas Horsten
  2004-02-10 14:51 ` Matt Domsch
@ 2004-02-10 14:58 ` Christophe Saout
  2004-02-10 18:26   ` Kevin P. Fleming
  2004-02-10 17:38 ` Jeff Garzik
  2004-02-10 22:41 ` Neil Brown
  3 siblings, 1 reply; 15+ messages in thread
From: Christophe Saout @ 2004-02-10 14:58 UTC (permalink / raw)
  To: Thomas Horsten; +Cc: linux-kernel, linux-raid

On Tue, 10.02.2004 at 15:18, Thomas Horsten wrote:

> - Is there a "recommended" way to enumerate all block devices (not
> partitions) from userside? Since this is ATA RAID, I could of course just
> read the ideX majors from /proc/devices and try all the minors, but I
> would prefer to get a list of all detected block devices in a portable
> way.

You could go through the block devices in /sys and check whether each is
attached to a PCI card from one of the ataraid vendors...?
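
A rough sketch of the enumeration half, assuming the 2.6 sysfs layout where every whole device has a directory under /sys/block containing a "dev" file with its major:minor, and partitions sit one level deeper. The function name and the SYSFS_ROOT override are assumptions added so the loop can be pointed at a test tree; the vendor check is left out.

```shell
# List whole-disk block devices as "name major:minor", one per line.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
list_block_devices() (
	root=${SYSFS_ROOT:-/sys}
	for d in "$root"/block/*; do
		# Top-level entries are whole devices. Partitions live in
		# subdirectories (e.g. block/hda/hda1) and are never visited.
		[ -r "$d/dev" ] || continue
		printf '%s %s\n' "${d##*/}" "$(cat "$d/dev")"
	done
)
```

Running `list_block_devices` with no override walks the real /sys/block. Because only the top level is scanned, no special casing is needed to skip partitions.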

> - After I have used the DM (and possibly MD for some RAID types) to map
> the ataraid devices, is there a way to remove the partitions from the
> underlying disks from the kernel?

Nope.

> This was my main reason for wanting to
> do kernel-level autodetection of these arrays, so I could prevent add_disk
> from being called and analysing the partition table. On these BIOS RAIDs,
> in striped mode the first disk contains the partition table for the entire
> array in sector 0. If the user (or a script) tries to mount the
> partitions (or even read the extended partition table), it may try to read
> past the end of the disk and will in any case use wrong sector numbers,
> leading to possible disk corruption.

Well, if the device is used by DM, at least you cannot mount it anymore
(because it is bd_claimed), but you can still see it and access it via
open and read.

> On top of this it would be useful to make the underlying devices
> inaccessible after the mapped device is created (to prevent people from
> doing things like fdisk /dev/hda, when what they really wanted was
> something like fdisk /dev/ataraid/disc).

I have a really bad idea :)

Try to combine it with udev. udev calls the ide script, the ide script
then calls the ataraid detector. If the device is non-ataraid, go on as
usual. If it is, build the device-mapper device and symlink (if it
doesn't already exist) and tell udev to not create anything.


* Re: ATARAID userspace configuration tool
  2004-02-10 14:18 ATARAID userspace configuration tool Thomas Horsten
  2004-02-10 14:51 ` Matt Domsch
  2004-02-10 14:58 ` Christophe Saout
@ 2004-02-10 17:38 ` Jeff Garzik
  2004-02-10 17:47   ` Thomas Horsten
  2004-02-10 22:41 ` Neil Brown
  3 siblings, 1 reply; 15+ messages in thread
From: Jeff Garzik @ 2004-02-10 17:38 UTC (permalink / raw)
  To: Thomas Horsten; +Cc: linux-kernel, linux-raid

Thomas Horsten wrote:
> - Is there a "recommended" way to enumerate all block devices (not
> partitions) from userside? Since this is ATA RAID, I could of course just
> read the ideX majors from /proc/devices and try all the minors, but I
> would prefer to get a list of all detected block devices in a portable
> way.

sysfs, definitely.


> - After I have used the DM (and possibly MD for some RAID types) to map
> the ataraid devices, is there a way to remove the partitions of the
> underlying disks from the kernel? This was my main reason for wanting to
> do kernel-level autodetection of these arrays, so I could prevent add_disk
> from being called and analysing the partition table. On these BIOS RAIDs,
> in striped mode the first disk contains the partition table for the entire
> array in sector 0. If the user (or a script) tries to mount the
> partitions (or even read the extended partition table), it may try to read
> past the end of the disk and will in any case use wrong sector numbers,
> leading to possible disk corruption.

You have control of what happens to the devices.  If you don't want them
probed for partitions, they won't be.


> On top of this it would be useful to make the underlying devices
> inaccessible after the mapped device is created (to prevent people from
> doing things like fdisk /dev/hda, when what they really wanted was
> something like fdisk /dev/ataraid/disc).

This would be something to talk with the md maintainer about, I think. 
I'm not sure we want to do this, since the user may have a valid reason 
to access the underlying disk.

	Jeff


* Re: ATARAID userspace configuration tool
  2004-02-10 17:38 ` Jeff Garzik
@ 2004-02-10 17:47   ` Thomas Horsten
  2004-02-10 18:44     ` Jeff Garzik
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Horsten @ 2004-02-10 17:47 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel, linux-raid

On Tue, 10 Feb 2004, Jeff Garzik wrote:

> > On top of this it would be useful to make the underlying devices
> > inaccessible after the mapped device is created (to prevent people from
> > doing things like fdisk /dev/hda, when what they really wanted was
> > something like fdisk /dev/ataraid/disc).
>
> This would be something to talk with the md maintainer about, I think.
> I'm not sure we want to do this, since the user may have a valid reason
> to access the underlying disk.

That's true of course; one example would be removing the RAID superblock
with dd. The problem is that if this is done by mistake, it could be
catastrophic. It might be enough to remove the wrong partitions (with
BLKPG_DEL_PARTITION, thanks to Matt Domsch for pointing me in the right
direction). That will at least prevent mkfs /dev/hda1 etc., which would have
unforeseeable consequences.

But when the RAID/DM device is up, would it not be possible to generate an
EACCES or EBUSY error if someone tries to open the underlying device? If
he really wants to do it, he can just stop the DM device first.

// Thomas


* Re: ATARAID userspace configuration tool
  2004-02-10 14:58 ` Christophe Saout
@ 2004-02-10 18:26   ` Kevin P. Fleming
  2004-02-10 19:18     ` Christophe Saout
  0 siblings, 1 reply; 15+ messages in thread
From: Kevin P. Fleming @ 2004-02-10 18:26 UTC (permalink / raw)
  Cc: linux-kernel, linux-raid

Christophe Saout wrote:

> I have a really bad idea :)
> 
> Try to combine it with udev. udev calls the ide script, the ide script
> then calls the ataraid detector. If the device is non-ataraid, go on as
> usual. If it is, build the device-mapper device and symlink (if it
> doesn't already exist) and tell udev to not create anything.

This is not a bad idea, it's the future. The hotplug mechanism is 
exactly what should be used here. When a block-device hotplug ADD event 
occurs, you look at that device to see if it's something you care about. 
If not, just exit and leave it alone.

Now in the ATARAID case, where you need to see multiple devices before 
you can do anything with them, this means you'd need to keep some 
"state" somewhere about the devices you've seen so far, and the partial 
ATARAID devices they represent. When you get the hotplug event for the 
last piece of a particular ATARAID device, you use DM/MD to set up the 
device and make it available.

The wonderful part of this is, when you do that last step, _another_ 
block-device hotplug ADD event occurs for the new device you just 
created, and if the hotplug scripts are set up to run dmpartx or its 
equivalent for new block-devices, you are done. The partition tables 
_inside_ the ATARAID device will be read, more DM calls will be made to 
make those sub-devices available to userspace and everyone is thrilled 
about the elegance of the solution :-)
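
A minimal sketch of the "state" part, under stated assumptions: the array identifier and expected member count would really come from the RAID metadata on each disk, STATE_DIR and note_member are invented names, and there is no locking here even though hotplug events can run concurrently.

```shell
# Hypothetical location for the per-array state; one directory per
# array, one empty file per member seen so far.
STATE_DIR=${STATE_DIR:-/dev/.ataraid}

# Record that member $2 of array $1 (which has $3 members) has appeared.
# Prints the sorted member list once the set is complete and nothing
# while members are still missing.
note_member() (
	array=$1 member=$2 expected=$3
	dir=$STATE_DIR/$array
	mkdir -p "$dir" || exit 1
	: > "$dir/$member"
	# Count the members recorded so far via the glob expansion.
	set -- "$dir"/*
	if [ "$#" -ge "$expected" ]; then
		for f in "$@"; do
			printf '%s\n' "${f##*/}"
		done
	fi
)
```

The hotplug handler would call this once per ADD event and only go on to the DM/MD setup when it prints something. A real handler would also have to decide what to do with a degraded mirror whose second disk never shows up.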



* Re: ATARAID userspace configuration tool
  2004-02-10 17:47   ` Thomas Horsten
@ 2004-02-10 18:44     ` Jeff Garzik
  0 siblings, 0 replies; 15+ messages in thread
From: Jeff Garzik @ 2004-02-10 18:44 UTC (permalink / raw)
  To: Thomas Horsten; +Cc: linux-kernel, linux-raid

Thomas Horsten wrote:
> That's true of course, one example would be to remove the RAID superblock
> with dd. The problem is if this is done by mistake, it could be
> catastrophic. It might be enough to remove the wrong partitions (with
> BLKPG_DEL_PARTITION, thanks to Matt Domsch for pointing me in the right
> direction), it will at least prevent mkfs /dev/hda1 etc, which would have
> unforeseeable consequences.


There are 1001 things that could have unforeseen consequences when root
is doing something ;-)

This is getting into the area of standard Linux kernel policy (or lack 
thereof):  let root shoot himself in the foot, if he wishes.

I would certainly -want- to be able to dd -from- my underlying disks, 
even if an ataraid or md device is sitting on top, for example.

	Jeff





* Re: ATARAID userspace configuration tool
  2004-02-10 18:26   ` Kevin P. Fleming
@ 2004-02-10 19:18     ` Christophe Saout
  2004-02-10 19:24       ` Kevin P. Fleming
  2004-02-11  1:35       ` Greg KH
  0 siblings, 2 replies; 15+ messages in thread
From: Christophe Saout @ 2004-02-10 19:18 UTC (permalink / raw)
  To: Kevin P. Fleming; +Cc: linux-kernel, linux-raid

On Tue, 10.02.2004 at 19:26, Kevin P. Fleming wrote:

> > I have a really bad idea :)
> > 
> > Try to combine it with udev. udev calls the ide script, the ide script
> > then calls the ataraid detector. If the device is non-ataraid, go on as
> > usual. If it is, build the device-mapper device and symlink (if it
> > doesn't already exist) and tell udev to not create anything.
> 
> This is not a bad idea, it's the future.

I was just joking. I said that because it's not complete.

> The hotplug mechanism is 
> exactly what should be used here. When a block-device hotplug ADD event 
> occurs, you look at that device to see if it's something you care about. 
> If not, just exit and leave it alone.

udev maintains a database of already created devices. And sysfs is some
sort of database of really existing devices. The "telling udev to not
create the device and instead create it ourselves" is bad. We should be
able to tell udev that it should register and create another device
instead. Perhaps udev should know about compound devices.

I'm not sure but if udev knows about compound devices things get a bit
more complicated. A raid 1 setup would continue to work if one of the
devices is unplugged, a raid 0 setup fails to work if one device is
missing. Probably the device should be deleted only when both hard disks
are removed. Also it should be created if only one hard disk gets
plugged in. But on bootup if some script tells udev that one hard disk
is there and some seconds later that the second is also there the tool
shouldn't assume the raid has failed after seeing the first event.

Should we Cc a udev developer for an opinion?

> Now in the ATARAID case, where you need to see multiple devices before 
> you can do anything with them, this means you'd need to keep some 
> "state" somewhere about the devices you've seen so far, and the partial 
> ATARAID devices they represent. When you get the hotplug event for the 
> last piece of a particular ATARAID device, you use DM/MD to set up the 
> device and make it available.

As I said I think it is more complicated.

> The wonderful part of this is, when you do that last step, _another_ 
> block-device hotplug ADD event occurs for the new device you just 
> created, and if the hotplug scripts are set up to run dmpartx or its 
> equivalent for new block-devices, you are done.

Right. dmpartx should run on dm-[0-9]* and md[0-9]* events (but not
recursively of course ;)).

>  The partition tables 
> _inside_ the ATARAID device will be read, more DM calls will be made to 
> make those sub-devices available to userspace and everyone is thrilled 
> about the elegance of the solution :-)

Yes, sounds cool.


* Re: ATARAID userspace configuration tool
  2004-02-10 19:18     ` Christophe Saout
@ 2004-02-10 19:24       ` Kevin P. Fleming
  2004-02-11  1:35       ` Greg KH
  1 sibling, 0 replies; 15+ messages in thread
From: Kevin P. Fleming @ 2004-02-10 19:24 UTC (permalink / raw)
  Cc: linux-kernel, linux-raid

Christophe Saout wrote:

> Right. dmpartx should run on dm-[0-9]* and md[0-9]* events (but not
> recursively of course ;)).

Oh I don't know, people have been asking for the ability to have 
partitioned MD devices for so long, I can see where someone may very 
well want to be able to have an MD device composed of three disks, 
broken up into LVM2 volumes, one of which contains an image of a Sun 
disk with a Solaris disklabel on it :-) For the little bit of extra time 
that dmpartx would spend looking for partition tables I don't think the 
recursion would be painful (unless dmpartx is unnecessarily noisy or 
something).


* Re: ATARAID userspace configuration tool
  2004-02-10 14:18 ATARAID userspace configuration tool Thomas Horsten
                   ` (2 preceding siblings ...)
  2004-02-10 17:38 ` Jeff Garzik
@ 2004-02-10 22:41 ` Neil Brown
  3 siblings, 0 replies; 15+ messages in thread
From: Neil Brown @ 2004-02-10 22:41 UTC (permalink / raw)
  To: Thomas Horsten; +Cc: linux-kernel, linux-raid

On Tuesday February 10, thomas@horsten.com wrote:
> Hi,
> 
> I'm writing a userspace utility to detect/configure Medley (and later
> other) ataraid devices in 2.6.
...
> 
> On top of this it would be useful to make the underlying devices
> inaccessible after the mapped device is created (to prevent people from
> doing things like fdisk /dev/hda, when what they really wanted was
> something like fdisk /dev/ataraid/disc).

The best way to avoid this sort of problem is to change "fdisk" (and
mkfs and fsck and ...) to (optionally) open the device with O_EXCL.

In 2.6, a device that is "claimed" by a kernel subsystem (i.e. one that
is mounted, or is part of an MD or DM array, or has a partition which is
claimed in one of these ways) cannot be opened O_EXCL.

So these tools that operate on block devices and expect exclusive
access should ask for it.
They probably should have a way to not use O_EXCL if the admin
promises they know what they are doing, as fsck does need to run on a
mounted partition sometimes, and fdisk can reasonably be used on
devices with mounted partitions.  But the default should be O_EXCL.

NeilBrown

> 
> Detecting the partition table in userspace would fix this, but it's not
> planned before 2.7 and I don't think it is safe to leave the false
> partitions exposed.
> 
> - Some RAID types will need (I think) to use the MD framework as well as
> DM (e.g. RAID0+1), so the device the users would use is the md device, which
> would be composed of two dm devices. Is there a way to hide the underlying
> dm devices from the users so they only see the ones they should use (or
> prevent these from being used directly some other way)?
> 
> // Thomas


* Re: ATARAID userspace configuration tool
  2004-02-10 19:18     ` Christophe Saout
  2004-02-10 19:24       ` Kevin P. Fleming
@ 2004-02-11  1:35       ` Greg KH
  2004-02-11  1:45         ` Kevin P. Fleming
  1 sibling, 1 reply; 15+ messages in thread
From: Greg KH @ 2004-02-11  1:35 UTC (permalink / raw)
  To: Christophe Saout; +Cc: Kevin P. Fleming, linux-kernel, linux-raid

On Tue, Feb 10, 2004 at 08:18:34PM +0100, Christophe Saout wrote:
> 
> udev maintains a database of already created devices. And sysfs is some
> sort of database of really existing devices. The "telling udev to not
> create the device and instead create it ourselves" is bad. We should be
> able to tell udev that it should register and create another device
> instead. Perhaps udev should know about compound devices.
> 
> I'm not sure but if udev knows about compound devices things get a bit
> more complicated. A raid 1 setup would continue to work if one of the
> devices is unplugged, a raid 0 setup fails to work if one device is
> missing. Probably the device should be deleted only when both hard disks
> are removed. Also it should be created if only one hard disk gets
> plugged in. But on bootup if some script tells udev that one hard disk
> is there and some seconds later that the second is also there the tool
> shouldn't assume the raid has failed after seeing the first event.
> 
> Should we Cc a udev developer for an opinion?

udev can either ignore compound devices with a rule that matches the
dm-* block devices, or it can do something about them.

I really don't think udev in and of itself needs to know anything
special about these kinds of devices, as it will be glad to kick off
other programs for you if you want it to.

thanks,

greg k-h


* Re: ATARAID userspace configuration tool
  2004-02-11  1:35       ` Greg KH
@ 2004-02-11  1:45         ` Kevin P. Fleming
  2004-02-11 11:34           ` Christophe Saout
  0 siblings, 1 reply; 15+ messages in thread
From: Kevin P. Fleming @ 2004-02-11  1:45 UTC (permalink / raw)
  Cc: linux-kernel, linux-raid

Greg KH wrote:

> I really don't think udev in and of itself needs to know anything
> special about these kinds of devices, as it will be glad to kick off
> other programs for you if you want it to.

Agreed (not that I'm implementing any of this stuff :-)

The tricky part is for Thomas' ataraid-detect program to keep some 
information around when it has seen the first component of a RAID-0 but 
not the second (or vice-versa). It would be very inefficient to scan all 
known block devices every time a new one is added, although that 
brute-force method could be used just to get the program working at 
first. Once the whole idea has been tested and works properly (the 
ATARAID devices become available and function properly), the efficiency 
problem(s) could be addressed.


* Re: ATARAID userspace configuration tool
  2004-02-11  1:45         ` Kevin P. Fleming
@ 2004-02-11 11:34           ` Christophe Saout
  2004-02-11 14:18             ` Kevin P. Fleming
  0 siblings, 1 reply; 15+ messages in thread
From: Christophe Saout @ 2004-02-11 11:34 UTC (permalink / raw)
  To: Kevin P. Fleming; +Cc: linux-kernel, linux-raid

On Wed, 11.02.2004 at 02:45, Kevin P. Fleming wrote:

> The tricky part is for Thomas' ataraid-detect program to keep some 
> information around when it has seen the first component of a RAID-0 but 
> not the second (or vice-versa). It would be very inefficient to scan all 
> known block devices every time a new one is added, although that 
> brute-force method could be used just to get the program working at 
> first. Once the whole idea has been tested and works properly (the 
> ATARAID devices become available and function properly), the efficiency 
> problem(s) could be addressed.

Aren't the disks the ATARAID is made of usually on the same controller?
Then you only have to scan that one.




* Re: ATARAID userspace configuration tool
  2004-02-11 11:34           ` Christophe Saout
@ 2004-02-11 14:18             ` Kevin P. Fleming
  2004-02-11 19:48               ` Christophe Saout
  0 siblings, 1 reply; 15+ messages in thread
From: Kevin P. Fleming @ 2004-02-11 14:18 UTC (permalink / raw)
  Cc: linux-kernel, linux-raid

Christophe Saout wrote:

> Aren't the disks the ATARAID is made of usually on the same controller?
> Then you only have to scan that one.

Yes, that would be a simple optimization for this case. I was 
envisioning the future tools to handle existing MD and LVM autodetection 
and that would require looking at all potential block devices.


* Re: ATARAID userspace configuration tool
  2004-02-11 14:18             ` Kevin P. Fleming
@ 2004-02-11 19:48               ` Christophe Saout
  0 siblings, 0 replies; 15+ messages in thread
From: Christophe Saout @ 2004-02-11 19:48 UTC (permalink / raw)
  To: Kevin P. Fleming; +Cc: linux-kernel, linux-raid, Greg KH

[-- Attachment #1: Type: text/plain, Size: 3749 bytes --]

On Wed, 11.02.2004 at 15:18, Kevin P. Fleming wrote:

> > Aren't the disks the ATARAID is made of usually on the same controller?
> > Then you only have to scan that one.
> 
> Yes, that would be a simple optimization for this case. I was 
> envisioning the future tools to handle existing MD and LVM autodetection 
> and that would require looking at all potential block devices.

I've been prototyping something as a shell script.

The shell script needs to be run from hotplug (/etc/hotplug.d/defaults/
symlink) and udev (as first in udev.rules: PROGRAM="/path/to/bdev.sh %k
%M:%m", NAME="").

Adding devices is caught through udev, removing through hotplug (udev
doesn't call programs when devices get removed).

The udev part could be dropped but I thought that someone should tell
udev if either the device should be created as usual (unhandled), not be
created or created under a different name. Unfortunately the ignore rule
in udev is currently broken and I found out that returning an empty
string from the program makes udev try to mknod /dev and chmod /dev (and
sets it to 666, argh) instead of ignoring the device creation. Hmm.

Well, the script maintains a stupid database:

/dev/.bdev/of/<devname> for devices that have been recognized by the
script. The contents of the file are the major:minor pair and a list of
compound devices that use the device.

and

/dev/.bdev/to/<major>:<minor> for created compound devices (or
partitions). It contains the type of the device, the assigned name and
some private data.

/dev/.bdev/uptodate is created when all devices were scanned.

BTW: I added some dumb locking (which is needed) using a package called
dotlockfile.

What happens?

When the script is started without an uptodate file, it scans /sys/block/*
for all devices, creates all the /dev/.bdev/of/ files and touches uptodate.
If uptodate already exists, it registers only the new device.

Partitions detected by the kernel are completely ignored (and it tries
to tell udev not to create the device nodes, currently broken).

Then it calls the add_dev function. Here all checking should be done
(ataraid, other raid, whatever) and as a last resort partition
detection. It currently tries to do partition detection.

It creates a temporary device node, calls sfdisk on it to dump the
partition information, assigns the partition devices a name (the one
sfdisk chooses), calls dmsetup to create the mapped device and registers
the device in the /dev/.bdev/to/<major>:<minor> database and lists it in
the original /dev/.bdev/of/<oldname> file.
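
The sfdisk-to-dmsetup step can be sketched on its own as follows. This is an illustration rather than the code in the attachment: it assumes dump lines of the form "/dev/hda1 : start= 63, size= 208782, Id=83", and parts_to_tables is an invented name.

```shell
# Convert "sfdisk -d" style lines on stdin into
# "<partition-name> <dm table>" lines, where the table is a linear
# mapping onto the parent device $1 (given as major:minor).
# Lines without "start=" and partitions of size 0 are skipped.
parts_to_tables() {
	awk -v parent="$1" '
		/start=/ {
			gsub(/[=,]/, " ")        # "start= 63," becomes "start 63"
			n = split($1, path, "/") # keep only the last path component
			if ($6 > 0)
				printf "%s 0 %s linear %s %s\n", path[n], $6, parent, $4
		}'
}
```

Each output line carries the partition name followed by a table in the "start length linear device offset" form that dmsetup expects. The table part would then be fed to `dmsetup create part-<name>`, which reads the table from stdin when no table file is given.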

Now the kernel will call udev again with the dm device, bdev.sh will be
called, register the device and see that a /dev/.bdev/to/<major>:<minor>
exists for it. It will then create the device node with the name it
registered in the database (and try to tell udev to not care).
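
The lookup on that second pass could be sketched like this, assuming the record format described above (one KEY=value line each for TYPE and NAME). BDEV_DB and assigned_name are names invented for the sketch; unlike the attached script, it parses the record with sed instead of sourcing it, which avoids overwriting shell variables.

```shell
# Hypothetical override for the database root; defaults to the
# directory used in the description above.
BDEV_DB=${BDEV_DB:-/dev/.bdev}

# Print the name recorded for device number $1 (major:minor).
# Returns non-zero if no record exists, meaning the device was not
# created by the script and should be handled as usual.
assigned_name() {
	rec=$BDEV_DB/to/$1
	[ -f "$rec" ] || return 1
	sed -n 's/^NAME=//p' "$rec"
}
```

For the example further down, `assigned_name 254:3` would print dm-2p1 once the record exists.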

When a device is removed everything is done in reverse order.

When a device is removed that has partitions a function notify_dev kicks
in which probably wants to remove the mappings (partitions, etc...).

With tail -f /tmp/log you can watch the debug messages.

I've tried it using a LVM device which has a partition table on it.

lvchange -a y /dev/vg/test

The kernel will send a notification that dm-2 (254:2) was created. bdev.sh
will find a partition table, call dmsetup to create a "part-dm-2p1" dm
device and create /dev/.bdev/to/254:3 with dm-2p1 as the name. The kernel
will call bdev.sh again with dm-3 254:3; bdev.sh sees
/dev/.bdev/to/254:3 and creates the device node /dev/dm-2p1.

dmsetup remove part-dm-2p1

will remove the device node and the database will be updated
accordingly.

Well, it's hard to explain my thoughts here because it's somewhat
complicated... perhaps someone understands what I'm trying to prove
here. :/


[-- Attachment #2: bdev.sh --]
[-- Type: text/x-sh, Size: 3175 bytes --]

#!/bin/bash
# Needs bash: uses ${var/pat/repl}, &>, source and $(<file).
DATABASE=/dev/.bdev

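# Create a device-mapper device named <prefix>-<name> from the table on
# stdin and print its major:minor; clean up if creation fails.
mk_dev() {
	local NAME=$1-${2/\//-}
	(
		/sbin/dmsetup -v create "$NAME" || \
		/sbin/dmsetup remove "$NAME" &> /dev/null
	) | \
	sed -e '/minor:/!d;s/^[^0-9]*\([0-9]*\),[^0-9]*\([0-9]*\).*$/\1:\2/'
}

# Remove the device-mapper device named <prefix>-<name>.
# NAME is local so the global NAME used at the end of the script survives.
rm_dev() {
	local NAME=$1-${2/\//-}
	/sbin/dmsetup remove "$NAME" &> /dev/null
}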

mk_nod() {
	/bin/mknod /dev/$1 b ${2%:[0-9]*} ${2#[0-9]*:}
} 

rm_nod() {
	rm -f /dev/$1
}

####################################################################

mk_part() {
	echo part $1 $2 $3 $4 >> /tmp/log
	NDEV=$(echo "0 $3 linear $4 $2" | mk_dev part $1 $4)
	if [ -z "$NDEV" ]; then
		return 1
	fi
	OF=$DATABASE/of/$5
	TO=$DATABASE/to/$NDEV
	echo TYPE=part > $TO
	echo NAME=$1 >> $TO
	echo "LIST=\"\$LIST $NDEV\"" >> $OF
}

# Print "<name> <start> <size>" for each partition sfdisk finds on /dev/$1.
get_parts() {
	/sbin/sfdisk -dfqL /dev/$1 2> /dev/null | \
	sed -e '/start=/!d;s/[=,]/ /g;s/^\/dev\/tmp-//' | \
	awk '{ print $1 " " $4 " " $6 }'
}

check_parts() {
	get_parts $4 | \
	while read PART START SIZE; do
		if [ "$SIZE" -gt 0 ]; then
			mk_part $PART $START $SIZE $3 $2
		fi
	done
}

rm_part() {
	rm_dev part $1
}

##################################################################

register_dev() {
	echo register $1 $2 >> /tmp/log
	echo DEV=$2 > $DATABASE/of/$1
}

unregister_dev() {
	echo unregister $1 $2 >> /tmp/log
	rm -f $DATABASE/of/$1
}

add_dev() {
	if [ -f $DATABASE/to/$2 ]; then
		source $DATABASE/to/$2
	else
		NAME=$1
		RET=1
	fi
	echo add "$NAME $2" >> /tmp/log
	TMP=tmp-$NAME
	mk_nod $TMP $2
	check_parts $1 $NAME $2 $TMP
	rm_nod $TMP
}

notify_dev() {
	echo notify $TYPE $1 $2 >> /tmp/log
	case $TYPE in
	    part)
		rm_part $1 $2
		;;
	esac
}

remove_dev() {
	if [ -f $DATABASE/to/$2 ]; then
		source $DATABASE/to/$2
		rm -f $DATABASE/to/$2
	else
		NAME=$1
		RET=1
	fi
	echo remove $NAME $2 >> /tmp/log
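	for i in $LIST; do
		if [ -e $DATABASE/to/$i ]; then
			# Subshell: sourcing the child's record must not
			# overwrite NAME, which picks the node to remove later.
			( source $DATABASE/to/$i; notify_dev $NAME $i )
		fi
	done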
}

##################################################################

if [ "${DEVPATH}" = "${DEVPATH#/block}" ]; then
	exit 1
fi

if [ -n "$2" ]; then
	ACTION=add
	KERNEL=$1
	DEV=$2
	if [ ! -f /sys/block/$KERNEL/dev ]; then
		exit 0
	fi
else
	if [ "$ACTION" != remove ]; then
		exit 0
	fi
	KERNEL=${DEVPATH##*/}
fi

# The lock file lives inside $DATABASE, so the directory must exist first.
mkdir -p $DATABASE
dotlockfile -r3 -p $DATABASE/lock || exit 1

RET=0
NAME=""

echo action $ACTION >> /tmp/log
case "$ACTION" in
    add)
	if [ ! -f $DATABASE/uptodate ]; then
		rm -Rf $DATABASE/of $DATABASE/to
		mkdir -p $DATABASE/of $DATABASE/to
		for i in /sys/block/*; do
			register_dev ${i#/sys/block/} $(<$i/dev)
		done
		touch $DATABASE/uptodate
		if [ -e $DATABASE/of/$KERNEL ]; then
			add_dev $KERNEL $DEV
		fi
	else
		if [ ! -e $DATABASE/of/$KERNEL ]; then
			register_dev $KERNEL $DEV
			if [ -e $DATABASE/of/$KERNEL ]; then
				add_dev $KERNEL $DEV
			fi
		fi
	fi

	;;
    remove)
	if [ -e $DATABASE/of/$KERNEL ]; then
		LIST=""
		source $DATABASE/of/$KERNEL
		remove_dev $KERNEL $DEV
		unregister_dev $KERNEL $DEV
	fi
	;;
esac
	
dotlockfile -u $DATABASE/lock
if [ $RET -gt 0 ]; then
	exit 1
else
	if [ -n "$NAME" ]; then
		echo result $NAME >> /tmp/log
		case $ACTION in
		    add)
			mk_nod $NAME $DEV
			;;
		    remove)
			rm_nod $NAME
		esac
	else
		echo ignore >> /tmp/log
	fi
	exit 0
fi
