Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree -- d

linux-hotplug.vger.kernel.org archive mirror

* Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree -- d
@ 2001-02-08 16:45 Adam J. Richter
  2001-02-08 18:37 ` Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree Oliver Neukum
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Adam J. Richter @ 2001-02-08 16:45 UTC (permalink / raw)
  To: linux-hotplug

>From: Oliver Neukum <Oliver.Neukum@lrz.uni-muenchen.de>

>> >You need to prevent any changes to the bus while you do the scanning.
>> >Running hotplug support is not enough. You need further kernel support
>> >to do racefree scanning.
>>
>> 	You cannot lock physical reality.

>To a certain extent you can. You cannot hide removal, but you can hide 
>addition. That's enough.

	You'll have to show me an example of where not hiding addition
causes trouble.


>> 	In any case, the doubt that I originally expressed was about
>> the need to queue hotplug events _before_ the hot plug system was
>> initialized.  In your example, the events occur after the hot plug
>> system is initialized.

>No, it occurs while you are initialising the hotplug system.

	In the timeline that you listed, the hot plug events occurred
*after* the hot plug system was initialized (but while the existing
hardware was being scanned).  The doubt that I originally expressed
was about the need to queue hotplug events *before* the hot plug
system was initialized.

>You cannot scan a bus without locking it.

	You have not shown that or even defined it well at this point.

>While you lock the bus what do you do with the events ?

	While you *scan* the bus, the kernel would continue
to generate asynchronous calls to /sbin/hotplug, which would
block on the lock held by the currently running /sbin/hotplug
in the algorithm that I described.  When the currently
running /sbin/hotplug finishes, the second /sbin/hotplug
that is waiting will catch any changes that happened while
the previous one was running.
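
The blocking behaviour described above can be tried out with a small sketch. It is only an illustration, not the actual /sbin/hotplug code: the lock file path and the helper name try_lock are hypothetical, and the two "instances" are two separate opens of the same file in one process, which flock treats as independent lock holders on Linux. A non-blocking attempt is used so the sketch reports where a real second instance would wait.

```python
import fcntl
import os
import tempfile


def try_lock(fd):
    """Try to take an exclusive flock without blocking; report success."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False


# Hypothetical lock file standing in for whatever /sbin/hotplug would flock.
lock_path = os.path.join(tempfile.mkdtemp(), "hotplug.lock")

# Each "instance" opens the file itself, as separate processes would.
first = os.open(lock_path, os.O_CREAT | os.O_RDWR)
second = os.open(lock_path, os.O_RDWR)

first_got_lock = try_lock(first)         # instance #1 takes the lock
second_while_held = try_lock(second)     # instance #2 would block here
fcntl.flock(first, fcntl.LOCK_UN)        # instance #1 finishes
second_after_release = try_lock(second)  # instance #2 now proceeds

fcntl.flock(second, fcntl.LOCK_UN)
os.close(first)
os.close(second)
```

A real second instance would call flock without LOCK_NB and simply sleep until the first one releases the lock.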

>You could indeed do without queueing initial events, if you did
>scan the bus under lock.
>Provided that you give stable names (David's definition) to the agents
>in order to be able to lose information about ordering of events.

	For the algorithm that I posted, a naming scheme like the
one used by /proc/bus/usb is sufficient and there is no need to
do anything more to preserve the order of events.

>> 	However, now that you mention it, let's talk about handling
>> events that occur after initialization.  I think we can avoid the
>> need to queue those events too.
>>
>> 	First of all, we should recognize that for most hardware,
>> the events caused by unplugging are handled directly by the kernel,
>> not by the user level hot plugging code.  For example, with a DHCP
>> configured ethernet, removal of the ethernet interface card should
>> cause the socket that dhcp has open on the ethernet interface to
>> return an IO exception condition, and dhclient should get -EIO
>> and abort when it tries to execute the ioctl to check the status.

>I am afraid I have to disagree. Most hardware is associated with device nodes
>in /dev which retain their permission bits. You have to reset them on removal.

	I am not talking about returning -EPERM on open (which should
fail if a device is no longer present anyhow), I am talking about
returning -EIO or something similar on already open file descriptors.

>> 	Secondly, there is a way to get this processing right
>> where necessary without the need to queue events (which can overflow,
>> and involve maintaining arbitrarily large dynamic data structures).
>> All you need is a "new" flag that the kernel would set on a device
>> when it is inserted.  The hot plug code would be called by the kernel
>> with an argument indicating what device to check, without necessarily
>> even indicating whether it was a hot plug or a remove event.
>>
>> 	userland_hotplug_handler(dev)
>> 	{
>> 		// was_plugged_in[dev] is persistent data, perhaps
>> 		// stored in a file.
>>
>> 		acquire_lock(dev);   // Flock some file; could just have
>> 				     // one global lock.  Whatever.
>> 		if (was_plugged_in[dev]
>> 		    && (new[dev] || !is_plugged_in[dev])) {
>				^ race condition, the condition you check for might change
>> 			was_plugged_in[dev] = 0;
>> 			handle_remove_event(dev);
>> 		}

>and you may forget a removal event this way, which is bad

>> 		if (new[dev]) {
>> 			new[dev] = 0;
>> 			if (is_plugged_in[dev]) {
>> 				was_plugged_in[dev] = 1;
>> 				handle_insert_event(dev);
>> 			}
>> 		}
>> 		release_lock(dev);
>> 	}

	(Note: I have deleted the line near the bottom that read
"old_status[dev] = new_status;".  It was left over from a previous
edit and did not belong in this listing.)

	The "race condition" that you described is not a problem
because the kernel only *sets* new[dev], and only does so when
it detects an insertion, and then spawns a new /sbin/hotplug, which
will run after the currently running /sbin/hotplug releases its
lock and will process the new device if the previous /sbin/hotplug
instance did not already do so.  /sbin/hotplug only *clears* new[dev].

	If you do not understand, try making a timeline of an example.

Adam J. Richter     __     ______________   4880 Stevens Creek Blvd, Suite 104
adam@yggdrasil.com     \ /                  San Jose, California 95129-1034
+1 408 261-6630         | g g d r a s i l   United States of America
fax +1 408 261-6631      "Free Software For The Rest Of Us."

_______________________________________________
Linux-hotplug-devel mailing list  http://linux-hotplug.sourceforge.net
Linux-hotplug-devel@lists.sourceforge.net
http://lists.sourceforge.net/lists/listinfo/linux-hotplug-devel


* Re: Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree
  2001-02-08 16:45 Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree -- d Adam J. Richter
@ 2001-02-08 18:37 ` Oliver Neukum
  2001-02-08 20:40 ` Adam J. Richter
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Oliver Neukum @ 2001-02-08 18:37 UTC (permalink / raw)
  To: linux-hotplug


> >To a certain extent you can. You cannot hide removal, but you can hide
> >addition. That's enough.
>
> 	You'll have to show me an example of where not hiding addition
> causes trouble.

Anywhere where you do (short form):

if (type_of(dev) == TYPEA)
	init_typeA(dev);
if (type_of(dev) == TYPEB)
	init_typeB(dev);

> 	In the timeline that you listed, the hot plug events occurred
> *after* the hot plug system was initialized (but while the existing
> hardware was being scanned).  The doubt that I originally expressed
> was about the need to queue hotplug events *before* the hot plug
> system was initialized.

Now I see the misunderstanding.
You are right, strictly speaking you can get away without queueing
events before initialisation. But there is a time during initialisation
where you need to queue as the bus must not change while you scan it.

> >You cannot scan a bus without locking it.
>
> 	You have not shown that or even defined it well at this point.

For the same reason you must not add a device while a script is running.

At some level, scanning a bus will look like this:
for (i = 0; i < NUMBER_DEVS; i++) {	// or equivalent with a list
	dev = dev_list[i];
	if (type_of(dev) == ...
}
>
> >While you lock the bus what do you do with the events ?
>
> 	While you *scan* the bus, the kernel would continue
> to generate asynchronous calls to /sbin/hotplug, which would
> block on the lock held by the currently running /sbin/hotplug
> in the algorithm that I described.  When the currently
> running /sbin/hotplug finishes, the second /sbin/hotplug
> that is waiting will catch any changes that happened while
> the previous one was running.

That's the problem.
add for dev A - node 0
lock taken
		dev A removed
		add dev B - node 0 reused
init for device of type A
release lock
remove for dev A - node 0

You see, it is too late.
You must not assume that the second script can undo what the first has done.
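
The check-then-act gap in this timeline can be shown with a deterministic sketch. All names here (Node, agent, swap_a_for_b) are hypothetical and only model the argument; the hook stands in for the moment between reading the device type and initialising for it.

```python
class Node:
    """A bus node whose occupant can change at any time."""

    def __init__(self, device_type):
        self.device_type = device_type


def agent(node, between_check_and_act):
    """Check the type, then act on it; a hook runs in the gap."""
    seen = node.device_type        # check: type_of(dev)
    between_check_and_act(node)    # the bus may change here
    # act: return the init routine chosen and what is really in the node
    return ("init_type" + seen, node.device_type)


def swap_a_for_b(node):
    node.device_type = "B"         # dev A removed, dev B reuses node 0


def no_change(node):
    pass


quiet = agent(Node("A"), no_change)
raced = agent(Node("A"), swap_a_for_b)
print(quiet)   # ('init_typeA', 'A')
print(raced)   # ('init_typeA', 'B')
```

In the raced case the initialisation for type A is applied while a type B device occupies the node, which is the situation under dispute in this thread.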

> 	For the algorithm that I posted, a naming scheme like the
> one used by /proc/bus/usb is sufficient and there is no need to
> do anything more to preserve the order of events.

No, there isn't, but you still must lock and queue events that happen while
the lock is held.

> 	I am not talking about returning -EPERM on open (which should
> fail if a device is no longer present anyhow), I am talking about
> returning -EIO or something similar on already open file descriptors.

Those that are already open are not a problem.
You must reset permissions before the node can be reused.
Or you use unique names which are not supported at all at present.

> >> 	Secondly, there is a way to get this processing right
> >> where necessary without the need to queue events (which can overflow,
> >> and involve maintaining arbitrarily large dynamic data structures).
> >> All you need is a "new" flag that the kernel would set on a device
> >> when it is inserted.  The hot plug code would be called by the kernel
> >> with an argument indicating what device to check, without necessarily
> >> even indicating whether it was a hot plug or a remove event.
> >>
> >> 	userland_hotplug_handler(dev)
> >> 	{
> >> 		// was_plugged_in[dev] is persistent data, perhaps
> >> 		// stored in a file.
> >>
> >> 		acquire_lock(dev);   // Flock some file; could just have
> >> 				     // one global lock.  Whatever.
> >> 		if (was_plugged_in[dev]
> >> 		    && (new[dev] || !is_plugged_in[dev])) {
> >
> >				^ race condition, the condition you check for might change
> >
> >> 			was_plugged_in[dev] = 0;
> >> 			handle_remove_event(dev);
> >> 		}
> >
> >and you may forget a removal event this way, which is bad
> >
> >> 		if (new[dev]) {
> >> 			new[dev] = 0;
> >> 			if (is_plugged_in[dev]) {
> >> 				was_plugged_in[dev] = 1;
> >> 				handle_insert_event(dev);
> >> 			}
> >> 		}
> >> 		release_lock(dev);
> >> 	}
>
> 	(Note: I have deleted the line near the bottom that read
> "old_status[dev] = new_status;".  It was left over from a previous
> edit and did not belong in this listing.)
>
> 	The "race condition" that you described is not a problem
> because the kernel only *sets* new[dev], and only does so when
> it detects an insertion, and then spawns a new /sbin/hotplug, which
> will run after the currently running /sbin/hotplug releases its
> lock and will process the new device if the previous /sbin/hotplug
> instance did not already do so.  /sbin/hotplug only *clears* new[dev].
>
> 	If you do not understand, try making a timeline of an example.

That script suffers from the same problem as described above.
In addition you use is_plugged_in, which presumably reflects current status 
and therefore can change under your feet.
Furthermore by using new as a simple flag you can kill a legitimate new.

	Regards
		Oliver


* Re: Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree
  2001-02-08 16:45 Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree -- d Adam J. Richter
  2001-02-08 18:37 ` Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree Oliver Neukum
@ 2001-02-08 20:40 ` Adam J. Richter
  2001-02-08 22:02 ` Oliver Neukum
  2001-02-09 20:34 ` Adam J. Richter
  3 siblings, 0 replies; 5+ messages in thread
From: Adam J. Richter @ 2001-02-08 20:40 UTC (permalink / raw)
  To: linux-hotplug

>> >To a certain extent you can. You cannot hide removal, but you can hide
>> >addition. That's enough.
>>
>> 	You'll have to show me an example of where not hiding addition
>> causes trouble.

>Anywhere where you do (short form):

>if (type_of(dev) == TYPEA)
>	init_typeA(dev);
>if (type_of(dev) == TYPEB)
>	init_typeB(dev);

	You have already agreed that you cannot "lock" extraction
events, which is something that must have occurred for device A
to be replaced by device B.  Queuing hot plug events in the kernel
and delaying publishing of insert events in /proc would not
change the fact that the typeA device had already been physically
removed under this scenario.  All that your scheme would do would be
to delay the correct call to init_typeB (because the typeB device
had not yet been detected).

	Under any algorithm, no matter how much "locking" you do
(in the absence of physical restraints), the user can yank device
A a nanosecond after it is detected, at which point you have to rely
on the kernel's handling of the appropriate interrupts and init_typeA's
error handling code to fail without crashing the system.

	Likewise, under any algorithm, under any "locking" scheme, in
the case where the user yanks device A a nanosecond after it is
detected and inserts device B a nanosecond later, you have to rely on
the kernel's handling of those interrupts and init_typeA's error
handling to fail gracefully.  For example, with PCI hot plugging,
you have to rely on the fact that the PCI Base Address Registers of
the newly inserted card have not yet been set, so its I/O is not
yet initialized.  In the case of USB, you rely on the fact that
the USB port is automatically put into a "device present but not
connected" state by the USB hub.

	As far as I can tell, the only other synchronization issue
in quickly replacing device A with device B is ensuring that
device_A_removed() completes before device_B_inserted() is called,
and the algorithm that I posted does this.



>> 	In the timeline that you listed, the hot plug events occurred
>> *after* the hot plug system was initialized (but while the existing
>> hardware was being scanned).  The doubt that I originally expressed
>> was about the need to queue hotplug events *before* the hot plug
>> system was initialized.

>Now I see the misunderstanding.
>You are right, strictly speaking you can get away without queueing
>events before initialisation. But there is a time during initialisation
>where you need to queue as the bus must not change while you scan it.

	That is your assertion, not an example.  Please show a timeline
where my code fails in a way that cannot occur under your scheme of
event queuing and delaying publication of insert events into /proc
while immediately publishing extraction events.

	Perhaps your concern is about simple data corruption while
reading /proc?  That is, perhaps you think the read will return
something that is some combination of recent states of
each slot on the bus (e.g., perhaps you read half of a device ID
when device A is plugged in and half of the ID when device B is
plugged in)?  That can be addressed by just having any file
descriptor on that /proc file that was open during the hot plug
event return -EIO, at which point the hotplug user program can
reread the file without the need for locking.  If you think
there is some race condition there, bear in mind that my
pseudo-code clears new[dev] before handle_insert_event() reads
the device ID and calls the appropriate handler.  So, in order
for there to be a device change after that and before
handle_insert_event(), the kernel would have to have set new[dev] back
to 1 and called hotplug again, so that second hotplug instance will
run and install the driver for the new device after the first
instance of hotplug has finished.  Here is a timeline.

Physical			/sbin/hotplug		/sbin/hotplug
reality     Kernel		instance #1		instance #2
insert A
	    interrupt?
	    new[dev]=1
            /sbin/hotplug(dev)#1
				lock(dev)
				new[dev] = 0;
				insert_event(dev);
				[identifies device A]
remove A
	    interrupt?
	    /sbin/hotplug(dev)#2
insert B
	    interrupt?
	    new[dev] = 1;
	    /sbin/hotplug(dev)#3
							lock(dev)[blocks]
				devA->insert() fails
				release_lock(dev)
				.			[lock(dev) returns]
				.			remove_event(dev);
				exit [whenever]		new[dev] = 0;	
							remove_event(dev);
							[No remove handler
							registered since
							devA->insert failed.]
							insert_event(dev)
							[identifies device B]
							devB->insert()
							devB->insert succeeds
							release_lock(dev)
							exit

Note that there is a third instance of /sbin/hotplug(dev) that gets run.
It will scan the bus after hotplug #2, see no changes, and exit, because
instance #2 handled both remove A and insert B because they happened so
fast in this race example.
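
The timeline above can be replayed as a small simulation. This is only a sketch under stated assumptions: the Slot class, the occupant field (standing in for is_plugged_in plus the device ID) and the during_insert hook are hypothetical names, the three instances are serialized simply by calling them in order, and the driver is assumed to notice a mismatched device and fail, which is the premise of this message and not something the code proves.

```python
class Slot:
    """One hot plug slot; field names follow the pseudo-code in this thread."""

    def __init__(self):
        self.new = False             # set only by the kernel, on insertion
        self.occupant = None         # None means the slot is empty
        self.was_plugged_in = False  # persistent userland state
        self.log = []


def kernel_insert(slot, device):
    slot.occupant = device
    slot.new = True


def kernel_remove(slot):
    slot.occupant = None


def hotplug(slot, name, during_insert=None):
    """One /sbin/hotplug run; the lock is modelled by running runs in order."""
    if slot.was_plugged_in and (slot.new or slot.occupant is None):
        slot.was_plugged_in = False
        slot.log.append(name + " remove_event")
    if slot.new:
        slot.new = False
        if slot.occupant is not None:
            slot.was_plugged_in = True
            identified = slot.occupant
            if during_insert is not None:
                during_insert(slot)  # physical reality changes mid-run
            # ASSUMPTION: the driver notices a mismatched device and fails.
            ok = slot.occupant == identified
            result = "ok" if ok else "fails"
            slot.log.append("%s dev%s->insert %s" % (name, identified, result))


def swap_a_for_b(slot):
    kernel_remove(slot)
    kernel_insert(slot, "B")


slot = Slot()
kernel_insert(slot, "A")
hotplug(slot, "#1", during_insert=swap_a_for_b)
hotplug(slot, "#2")
hotplug(slot, "#3")
print("\n".join(slot.log))
```

Instance #3 adds nothing to the log, matching the note above that it finds no changes and exits.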

>> >You cannot scan a bus without locking it.
>>
>> 	You have not shown that or even defined it well at this point.

>For the same reason you must not add a device while a script is running.

	In the absence of a physical restraint, the user can do whatever
he or she wants.  That is why we have things like interrupts and
devices starting in a semi-disconnected state when inserted under most
(all?) hot plugging hardware schemes.  

>That's the problem.
>add for dev A - node 0
>lock taken
>		dev A removed
>		add dev B - node 0 reused
>init for device of type A

	^^^^^ this initialization will fail out.  In the meantime,
under my scheme, the kernel has reset new[node0] = 1 and invoked
a new /sbin/hotplug, which will run after the previous /sbin/hotplug
releases its lock (say, an flock on a file or an SysV-IPC semaphore),
and at that point will load the driver for device B.

>release lock
>remove for dev A - node 0

[...]

	Let's look at some real hot plug busses for example.

	If USB kernel driver A is loaded while A is still plugged in, it
will successfully identify A, and then it will get recognizable errors
when it does its IO requests after device A is removed.

	If A was removed before driver A loads (and, in your example,
device B is inserted), the device descriptors and interface descriptors will
not match what the driver wants, and initialization will return failure.
Only after that point, will /sbin/hotplug #1 release its lock and exit,
allowing the two /sbin/hotplug's that were spawned when device A was
removed and device B was inserted to run.  driver B will be loaded by
hotplug #2 or #3 (depending on how quickly device B was inserted and how
long hotplug #1 took to run).

	Under CardBus (PCI), a similar thing will happen using PCI
device and vendor IDs.  The newly inserted device B's PCI base address
registers will not be set at that point, so driver A does not have
to worry about poking IO ports on the newly inserted card B (the code
that actually reads the device IDs and sets the BARs does have to
be careful and watch the appropriate flags in case this occurs).


>You must not assume that the second script can undo what the first has done.

	I can assume that deviceA->remove() can undo deviceA->insert() if
deviceA->insert() ran to completion and returned "success."

>> 	For the algorithm that I posted, a naming scheme like the
>> one used by /proc/bus/usb is sufficient and there is no need to
>> do anything more to preserve the order of events.

>No, there isn't, but you still must lock and queue events that happen while
>the lock is held.

	I am going to "[snip]" unsupported assertions from your email
in the future.  You have been warned.


>> 	I am not talking about returning -EPERM on open (which should
>> fail if a device is no longer present anyhow), I am talking about
>> returning -EIO or something similar on already open file descriptors.

>Those that are already open are not a problem.
>You must reset permissions before the node can be reused.
>Or you use unique names which are not supported at all at present.


	When init_typeA can falsely succeed on deviceB, remove_typeA
will successfully shut it down a moment later and init_typeB() will
then be run and initialize device B correctly.  The truly brief period
when deviceB is misinitialized (the new /sbin/hotplug is just waiting
for the semaphore from the previous one to be released) occurs under
my scheme in a *subset* of the times when it occurs under yours.
Also, the only reason why this can happen is when this situation
is considered harmless enough so that init_typeA() does not have code
to guard against it (it can always scrutinize the device ID information
more carefully and check that the equivalent of new[dev] is not set to
guard against changes after it has read the device ID information).

	Let's go through an example.

		deviceA = plain old USB ethernet
		deviceB = wireless USB ethernet that needs some
			iwconfig settings to actually see the traffic.

	What would happen is:

Physical reality		Computer

insert regular ethernet (devA)
				/sbin/hotplug #1 starts
				sees regular ethernet (devA)
				devA->insert invokes "dhclient eth0 &"
remove regular ethernet (devA)
insert wireless ethernet (devB)
				/sbin/hotplug #1 took this long to exit
				for some reason.

				/sbin/hotplug #2 starts
				Sees new[dev], calls devA->remove.
				devA->remove does "kill $(cat /var/run/dhclient.eth0.pid)"
				identifies device B, calls devB->insert
				devB->insert does:
					iwconfig eth0 essid Any
					dhclient eth0 &
					exit
				/sbin/hotplug #2 exits
				/sbin/hotplug #3 finds nothing to do, exits.

[...]
>> >> 	userland_hotplug_handler(dev)
>> >> 	{
>> >> 		// was_plugged_in[dev] is persistent data, perhaps
>> >> 		// stored in a file.
>> >>
>> >> 		acquire_lock(dev);   // Flock some file; could just have
>> >> 				     // one global lock.  Whatever.
>> >> 		if (was_plugged_in[dev]
>> >> 		    && (new[dev] || !is_plugged_in[dev])) {
>> >
>> >				^ race condition, the condition you check for might change
>> >
>> >> 			was_plugged_in[dev] = 0;
>> >> 			handle_remove_event(dev);
>> >> 		}
>> >
>> >and you may forget a removal event this way, which is bad

	The kernel only sets new[dev] and the user level program
only clears it when it believes the socket is already empty (so
there is no removal event to be lost except from a device whose
insert event was never processed--i.e., someone inserted and removed
a card before the insert event was even noticed).

>> >
>> >> 		if (new[dev]) {
>> >> 			new[dev] = 0;
>> >> 			if (is_plugged_in[dev]) {
>> >> 				was_plugged_in[dev] = 1;
>> >> 				handle_insert_event(dev);
>> >> 			}
>> >> 		}
>> >> 		release_lock(dev);
>> >> 	}
>>
>> 	(Note: I have deleted the line near the bottom that read
>> "old_status[dev] = new_status;".  It was left over from a previous
>> edit and did not belong in this listing.)
>>
>> 	The "race condition" that you described is not a problem
>> because the kernel only *sets* new[dev], and only does so when
>> it detects an insertion, and then spawns a new /sbin/hotplug, which
>> will run after the currently running /sbin/hotplug releases its
>> lock and will process the new device if the previous /sbin/hotplug
>> instance did not already do so.  /sbin/hotplug only *clears* new[dev].
>>
>> 	If you do not understand, try making a timeline of an example.

>That script suffers from the same problem as described above.

	I do not understand and I really do not understand why you
do not provide an example timeline.

>In addition you use is_plugged_in, which presumably reflects current status 
>and therefore can change under your feet.

	Show me a timeline of where that changing causes a problem.

>Furthermore by using new as a simple flag you can kill a legitimate new.

	     new[dev] is cleared before things like is_plugged_in[dev] is
checked and the device ID is read.  The events that are "killed" are always
events that are no longer applicable (insert events if the inserted
device has already been removed).

Adam J. Richter     __     ______________   4880 Stevens Creek Blvd, Suite 104
adam@yggdrasil.com     \ /                  San Jose, California 95129-1034
+1 408 261-6630         | g g d r a s i l   United States of America
fax +1 408 261-6631      "Free Software For The Rest Of Us."


* Re: Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree
  2001-02-08 16:45 Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree -- d Adam J. Richter
  2001-02-08 18:37 ` Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree Oliver Neukum
  2001-02-08 20:40 ` Adam J. Richter
@ 2001-02-08 22:02 ` Oliver Neukum
  2001-02-09 20:34 ` Adam J. Richter
  3 siblings, 0 replies; 5+ messages in thread
From: Oliver Neukum @ 2001-02-08 22:02 UTC (permalink / raw)
  To: linux-hotplug

> >if (type_of(dev) == TYPEA)
> >	init_typeA(dev);
> >if (type_of(dev) == TYPEB)
> >	init_typeB(dev);
>
> 	You have already agreed that you cannot "lock" extraction
> events, which is something that must have occurred for device A
> to be replaced by device B.  Queuing hot plug events in the kernel
> and delaying publishing of insert events in /proc would not

I want the device completely inaccessible to user space, not just delay 
publishing.

> change the fact that the typeA device had already been physically
> removed under this scenario.  All that your scheme would do would be
> to delay the correct call to init_typeB (because the typeB device
> had not yet been detected).

If init_typeA were called after device A had been removed it would fail.
If however type_of returned A before device replacement and init_typeA
were called for a device of type B an error would be made.
The wrong method of initialisation would be called resulting in unknown
consequences.

> 	Likewise, under any algorithm, under any "locking" scheme, in
> the case where the user yanks device A a nanosecond after it is
> detected and inserts device B a nanosecond later, you have to rely on
> the kernel's handling of those interrupts and init_typeA's error
> handling to fail gracefully.  For example, with PCI hot plugging,
> you have to rely on the fact that the PCI Base Address Registers of
> the newly inserted card have not yet been set, so its I/O is not
> yet initialized.  In the case of USB, you rely on the fact that
> the USB port is automatically put into a "device present but not
> connected" state by the USB hub.

You are absolutely correct. I just wish the device to remain in that state
until the running agent has terminated. I believe I have shown the race 
condition that arises otherwise.

> 	Perhaps your concern is about simple data corruption while
> reading /proc?  That is, perhaps you think the result of the read
> will return something that is some combination of recent states of
> each slot on the bus (e.g., perhaps you read half of a device ID
> when device A is plugged in and half of the ID when device B is
> plugged in)?  That can be addressed by just having any file

I am worried about the content of that read being no longer true when you act 
upon it.

>
> Physical			/sbin/hotplug		/sbin/hotplug
> reality     Kernel		instance #1		instance #2
> insert A
> 	    interrupt?
> 	    new[dev]=1
>             /sbin/hotplug(dev)#1
> 				lock(dev)
> 				new[dev] = 0;
> 				insert_event(dev);
> 				[identifies device A]
> remove A
> 	    interrupt?
> 	    /sbin/hotplug(dev)#2
> insert B
> 	    interrupt?
> 	    new[dev] = 1;
> 	    /sbin/hotplug(dev)#3
> 							lock(dev)[blocks]
> 				devA->insert() fails

How do you guarantee that devA->insert() fails ?
I see no way.

> 				release_lock(dev)
> 				.			[lock(dev) returns]
> 				.			remove_event(dev);
> 				exit [whenever]		new[dev] = 0;
> 							remove_event(dev);
> 							[No remove handler
> 							registered since
> 							devA->insert failed.]
> 							insert_event(dev)
> 							[identifies device B]
> 							devB->insert()
> 							devB->insert succeeds
> 							release_lock(dev)
> 							exit
>

> >That's the problem.
> >add for dev A - node 0
> >lock taken
> >		dev A removed
> >		add dev B - node 0 reused
> >init for device of type A
>
> 	^^^^^ this initialization will fail out.  In the meantime,

How ? Why ? That is an unfounded assumption. You cannot know that
an arbitrary initialisation will fail.

> under my scheme, the kernel has reset new[node0] = 1 and invoked
> a new /sbin/hotplug, which will run after the previous /sbin/hotplug
> releases its lock (say, an flock on a file or an SysV-IPC semaphore),
> and at that point will load the driver for device B.
>
> >release lock
> >remove for dev A - node 0
>
> [...]
>
> 	Let's look at some real hot plug busses for example.
>
> 	If USB kernel driver A is loaded while A is still plugged in, it
> will successfully identify A, and then it will get recognizable errors
> when it does its IO requests after device A is removed.
>
> 	If A was removed before driver A loads (and, in your example
> device B is inserted), the device descriptors and interface descriptors will
> not match what the driver wants, and initialization will return failure.

How do you guarantee that the user space agent doesn't do something
that doesn't involve the device directly ?
It could start a daemon, change permissions on device nodes, change
/etc/fstab, ...
In fact if it were used only to load a device driver, the kernel could just 
call kmod.

> 	Under CardBus (PCI), a similar thing will happen using PCI
> device and vendor IDs.  The newly inserted device B's PCI base address
> registers will not be set at that point, so driver A does not have
> to worry about poking IO ports on the newly inserted card B (the code
> that actually reads the device IDs and sets the BARs does have to
> be careful and watch the appropriate flags in case this occurs).

If you take a look at the USB code, you'll see that the call of the user 
space agent comes _after_ the driver has been probed for.
Thus the driver is bound to the device and functional.

Yet the user space agent may very well wish to access the device, thus there 
is no alternative.

> 	I can assume that deviceA->remove() can undo deviceA->insert() if
> deviceA->insert() ran to completion and returned "success."

Provided it was run on a device it was intended for.
Otherwise you depend on the specifics of the insert method,
which IMHO kernel code must not do.

> 	When init_typeA can falsely succeed on deviceB, remove_typeA
> will successfully shut it down a moment later and init_typeB() will

How do you guarantee that ? I can see no reason you could simply assume 
anything about the result of an initialisation done on a device of type B 
intended for a device of type A.

> then be run and initialize device B correctly.  The truly brief period
> when deviceB is misinitialized (the new /sbin/hotplug is just waiting
> for the semaphore from the previous one to be released) occurs under
> my scheme in a *subset* of the times when it occurs under yours.

Why ? Please explain. Obviously I consider my scheme invulnerable in that 
regard. ;-)

> Also, the only reason why this can happen is when this situation
> is considered harmless enough so that init_typeA() does not have code
> to guard against it (it can always scrutinize the device ID information
> more carefully and check that the equivalent of new[dev] is not set to
> guard against changes after it has read the device ID information).

What if the check of new[] happens before the device is exchanged ?
Is there any way you can make
if (check_identity_really_well(dev))
	initialise(dev);

safe without a lock ?

Now suppose the wireless card doesn't need iwconfig (reasonable, some don't)
and the user doesn't want dhcp on a wireless card.

Now you have (if you are really unlucky):

Physical reality				Computer
insert regular ethernet (dev A)
					/sbin/hotplug #1 starts
					sees regular ethernet (dev A)
remove regular ethernet (dev A)
insert wireless ethernet (dev B)
					dhclient eth0 &
					/sbin/hotplug exits

Isn't that an error that cannot be prevented by this design ?

Now to the timeline for lost removal:

------------------------------------
First device plugged in
------------------------------------
         userland_hotplug_handler(dev)
         {
                 // was_plugged_in[dev] is persistent data, perhaps
                 // stored in a file.

                 acquire_lock(dev);   // Flock some file; could just have
                                     // one global lock.  Whatever.
                 if (was_plugged_in[dev]
                    && (new[dev] || !is_plugged_in[dev])) {
                         was_plugged_in[dev] = 0;
                         handle_remove_event(dev);
                 }
                 if (new[dev]) {
----------------------------------
device removal: new[dev] = 1
----------------------------------
                         new[dev] = 0;
                         if (is_plugged_in[dev]) {
                                 was_plugged_in[dev] = 1;
                                 handle_insert_event(dev);
                         }
                 }
                 release_lock(dev);
         }

-------------------------------------
second instance of script getting lock, still new[] = 0
-------------------------------------
                 if (was_plugged_in[dev]
                    && (new[dev] ||
-------------------------------------
device addition: new[] = 1
-------------------------------------
                        !is_plugged_in[dev])) {
-------------------------------------
test has failed -> removal lost
-------------------------------------
                         was_plugged_in[dev] = 0;
                         handle_remove_event(dev);
                 }

I hope you can understand it this way; I don't have the real estate
on screen for the conventional way of showing this.

	Kindest Regards (I hope I have made myself clear now)
		Oliver

_______________________________________________
Linux-hotplug-devel mailing list  http://linux-hotplug.sourceforge.net
Linux-hotplug-devel@lists.sourceforge.net
http://lists.sourceforge.net/lists/listinfo/linux-hotplug-devel


* Re: Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree
  2001-02-08 16:45 Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree -- d Adam J. Richter
                   ` (2 preceding siblings ...)
  2001-02-08 22:02 ` Oliver Neukum
@ 2001-02-09 20:34 ` Adam J. Richter
  3 siblings, 0 replies; 5+ messages in thread
From: Adam J. Richter @ 2001-02-09 20:34 UTC (permalink / raw)
  To: linux-hotplug

	I have responded to Oliver's doubts about the hot plugging
algorithm that I posted and dropped the linux-hotplug list from the
cc list.  I doubt anyone on the list is following exchanges of
enormous email messages, but please let me know if you want to
continue to be cc'ed.

	I don't think Oliver has found a problem with the algorithm
that I posted (which would involve little kernel code if any), but
I imagine one of us will post a summary of our discussion when
it concludes.

Adam J. Richter     __     ______________   4880 Stevens Creek Blvd, Suite 104
adam@yggdrasil.com     \ /                  San Jose, California 95129-1034
+1 408 261-6630         | g g d r a s i l   United States of America
fax +1 408 261-6631      "Free Software For The Rest Of Us."


Thread overview: 5+ messages
2001-02-08 16:45 Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree -- d Adam J. Richter
2001-02-08 18:37 ` Algorithm for hotplugging without an event queue (was: Adding PCMCIA support to the kernel tree Oliver Neukum
2001-02-08 20:40 ` Adam J. Richter
2001-02-08 22:02 ` Oliver Neukum
2001-02-09 20:34 ` Adam J. Richter
