Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58

public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
       [not found]     ` <20030116173539.GA31235@kroah.com>
@ 2003-01-16 19:43       ` Matthew Dharm
  2003-01-16 19:53         ` Greg KH
       [not found]         ` <20030116195306.GA32697@kroah.com>
  0 siblings, 2 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-01-16 19:43 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 1367 bytes --]

Well, we only create the host when the device is first attached.  After
that, if it goes away and comes back, we re-connect it to the old SCSI
host.

But, while the device is gone, you've created an association between a SCSI
node that exists and a non-existant USB device.  Basically, you've got a
pointer that is no longer valid.  And when the device is re-attached, there
isn't code to re-establish the correct SCSI<->USB association.

Something like this would proabably make sense if the hot-unplugging code
for SCSI hosts was really stable -- then we could unregister the host when
the device went away, and this relation would be disconnected
automatically.

Matt

On Thu, Jan 16, 2003 at 09:35:39AM -0800, Greg KH wrote:
> On Thu, Jan 16, 2003 at 09:31:12AM -0800, Matthew Dharm wrote:
> > Hrm... doesn't this all fall to pot when the device is unplugged and
> > repluged?
> 
> Um, how?  This seems to work for me, but I don't have a lot of devices
> here.  And if there is a problem, you might want to tell the scsi
> people, as they are the ones advocating this call be added.

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

You are needink to look more evil.  You likink very strong coffee?
					-- Pitr to Dust Puppy
User Friendly, 10/16/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 19:43       ` [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 Matthew Dharm
@ 2003-01-16 19:53         ` Greg KH
       [not found]         ` <20030116195306.GA32697@kroah.com>
  1 sibling, 0 replies; 106+ messages in thread
From: Greg KH @ 2003-01-16 19:53 UTC (permalink / raw)
  To: linux-usb-devel, Linux SCSI list

On Thu, Jan 16, 2003 at 11:43:23AM -0800, Matthew Dharm wrote:
> Well, we only create the host when the device is first attached.  After
> that, if it goes away and comes back, we re-connect it to the old SCSI
> host.

Ick, so when the device is gone, where does the SCSI host go?  Is it
still represented in sysfs and in the SCSI core properly?

> But, while the device is gone, you've created an association between a SCSI
> node that exists and a non-existant USB device.  Basically, you've got a
> pointer that is no longer valid.  And when the device is re-attached, there
> isn't code to re-establish the correct SCSI<->USB association.

Why not?  I'm guessing that you re-establish this association, right?
Then you might have to add this same call to whereever that is done.

> Something like this would proabably make sense if the hot-unplugging code
> for SCSI hosts was really stable -- then we could unregister the host when
> the device went away, and this relation would be disconnected
> automatically.

Well, push back on the SCSI people to fix this then, hot-unplug should
work properly on the SCSI layer too :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 106+ messages in thread

[parent not found: <20030116195306.GA32697@kroah.com>]

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
       [not found]         ` <20030116195306.GA32697@kroah.com>
@ 2003-01-16 20:10           ` Linus Torvalds
  2003-01-16 20:43             ` greg kh
                               ` (2 more replies)
  2003-01-16 20:40           ` David Brownell
  1 sibling, 3 replies; 106+ messages in thread
From: Linus Torvalds @ 2003-01-16 20:10 UTC (permalink / raw)
  To: linux-scsi

In article <20030116195306.GA32697@kroah.com>, Greg KH  <greg@kroah.com> wrote:
>On Thu, Jan 16, 2003 at 11:43:23AM -0800, Matthew Dharm wrote:
>> Well, we only create the host when the device is first attached.  After
>> that, if it goes away and comes back, we re-connect it to the old SCSI
>> host.
>
>Ick, so when the device is gone, where does the SCSI host go?  Is it
>still represented in sysfs and in the SCSI core properly?

This is pure and utter USB storage stupidity, and nothing else.

When the USB storage device is unplugged, the device should be
unregistered.  It should be _gone_.  It isn't sleeping, it's dead.  It's
an ex-device. 

The fact that USB storage still keeps track of devices that do not exist
is WRONG. It has resulted in problems in real life multiple times with
devices that get re-attached and have a new serial number (quite common
as far as I can tell in cheap flash readers), where the _stupid_ rule of
trying to keep track of what has been attached results in the device
moving from /dev/sda to sdb to sdc as it is unplugged and re-plugged.

>> Something like this would proabably make sense if the hot-unplugging code
>> for SCSI hosts was really stable -- then we could unregister the host when
>> the device went away, and this relation would be disconnected
>> automatically.
>
>Well, push back on the SCSI people to fix this then, hot-unplug should
>work properly on the SCSI layer too :)

IT IS NOT A SCSI LAYER PROBLEM! It's purely a USB problem.

		Linus

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 20:10           ` Linus Torvalds
@ 2003-01-16 20:43             ` greg kh
  2003-01-16 21:41             ` Linus Torvalds
  2003-01-16 22:51             ` Matthew Dharm
  2 siblings, 0 replies; 106+ messages in thread
From: greg kh @ 2003-01-16 20:43 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-usb-devel

Copied to linux-usb-devel, as they should also see this...

On Thu, 16 Jan 2003 12:10:18 -0800, Linus Torvalds wrote:

> In article <20030116195306.GA32697@kroah.com>, Greg KH  <greg@kroah.com>
> wrote:
>>On Thu, Jan 16, 2003 at 11:43:23AM -0800, Matthew Dharm wrote:
>>> Well, we only create the host when the device is first attached. After
>>> that, if it goes away and comes back, we re-connect it to the old SCSI
>>> host.
>>
>>Ick, so when the device is gone, where does the SCSI host go?  Is it
>>still represented in sysfs and in the SCSI core properly?
> 
> This is pure and utter USB storage stupidity, and nothing else.
> 
> When the USB storage device is unplugged, the device should be
> unregistered.  It should be _gone_.  It isn't sleeping, it's dead.  It's
> an ex-device.

Agreed, I thought most of that logic had been removed from the
usb-storage driver as the SCSI layer can now handle removing devices just
fine (or so Mike Anderson tells me :)

thanks,

greg k-h


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 20:10           ` Linus Torvalds
  2003-01-16 20:43             ` greg kh
@ 2003-01-16 21:41             ` Linus Torvalds
  2003-01-16 22:51             ` Matthew Dharm
  2 siblings, 0 replies; 106+ messages in thread
From: Linus Torvalds @ 2003-01-16 21:41 UTC (permalink / raw)
  To: linux-scsi

In article <b073ja$12i$1@penguin.transmeta.com>,
Linus Torvalds <torvalds@transmeta.com> wrote:
>
>When the USB storage device is unplugged, the device should be
>unregistered.  It should be _gone_.  It isn't sleeping, it's dead.  It's
>an ex-device. 

As a follow-up on my rant: if people want reliable static naming over
removal/reinsertion of a device, we actually already have exactly that,
in user space. Using the hotplug agents.

There's a mostly unrelated problem we have from a kernel perspective
which is a device that is actually in use when the disconnect happens -
say as a part of a sleep sequence (which will cause a forced disconnect/
reconnect event).

That's a generic hotplug issue, and still should not be a reason for
trying to (on a driver level) keep track of devices that are gone.  It
should really be up to upper layers to be able to re-associate things
properly, since doing it on a driver level simply isn't even possible in
the generic case. 

			Linus

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 20:10           ` Linus Torvalds
  2003-01-16 20:43             ` greg kh
  2003-01-16 21:41             ` Linus Torvalds
@ 2003-01-16 22:51             ` Matthew Dharm
  2 siblings, 0 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-01-16 22:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-scsi

[-- Attachment #1: Type: text/plain, Size: 2676 bytes --]

On Thu, Jan 16, 2003 at 08:10:18PM +0000, Linus Torvalds wrote:
> In article <20030116195306.GA32697@kroah.com>, Greg KH  <greg@kroah.com> wrote:
> >On Thu, Jan 16, 2003 at 11:43:23AM -0800, Matthew Dharm wrote:
> >> Well, we only create the host when the device is first attached.  After
> >> that, if it goes away and comes back, we re-connect it to the old SCSI
> >> host.
> >
> >Ick, so when the device is gone, where does the SCSI host go?  Is it
> >still represented in sysfs and in the SCSI core properly?
> 
> This is pure and utter USB storage stupidity, and nothing else.

Well, I happen to agree.  But it was a necessary evil.

Up until recently, an attempt to hot-unplug a SCSI host would result is
gross system instability.

Also, up until recently, the debounce on USB ports means that a usb-storage
device could get disconnected and reconnected when a completely unrelated
device was attached to a hub.  Newer hubs fixed this.

Also, the _vast_ majority of devices keep sane serial numbers.  You happen
to have one of the broken ones, which tends to skew your perception.  But
of the two dozen or so storage devices I have, only one exhibits that
problem (and only with the old firmware loaded).

I'd like to fix this.  I've been wanting to fix this for a while.  I want
SCSI hot-unplug to work well enough to rely on it.  I want to be able to
take a USB disk, unplug it while writing to it, plug in a new USB disk, and
have a guarantee that the SCSI layer won't get confused and try to write
the data for the old disk to the new one (think scsi0 goes away, to be
replaced by a new scsi0, without properly stopping command initiators).

Heck, it was only in early 2.4.x that the SCSI mid-layer fixed an
off-by-one error that made it queue one-too-many commands to a HBA.

So, here's how it is:  We all want it fixed.  Linus, if you're willing to
deal with a major change for this at this late-date in the 2.5.x lifecycle,
then we'll do it.  Simple as that.  That's pretty much the only reason I
haven't started already -- the SCSI add-ons that support the hot-unplug
were too new and too late for me to be comfortable making a major change
like this.

Matt

P.S. BTW, this is also the original behavior of usb-storage, all the way
back to before I was working on it (and before it was called usb-storage).
I've never liked it this way...

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

Hey Chief.  We've figured out how to save the technical department.  We 
need to be committed.
					-- The Techs
User Friendly, 1/22/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
       [not found]         ` <20030116195306.GA32697@kroah.com>
  2003-01-16 20:10           ` Linus Torvalds
@ 2003-01-16 20:40           ` David Brownell
  2003-01-16 20:48             ` Mike Anderson
  1 sibling, 1 reply; 106+ messages in thread
From: David Brownell @ 2003-01-16 20:40 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 1023 bytes --]

Greg KH wrote:
> On Thu, Jan 16, 2003 at 11:43:23AM -0800, Matthew Dharm wrote:
> 
>>Well, we only create the host when the device is first attached.  After
>>that, if it goes away and comes back, we re-connect it to the old SCSI
>>host.
> 
> 
> Ick, so when the device is gone, where does the SCSI host go?  Is it
> still represented in sysfs and in the SCSI core properly?

Just for the record ... I think usb-storage is the only USB driver
that tries to keep state about devices across disconnect/reconnect.

I agree with Greg that hotplugging (including unplug/replug) should
work well with SCSI.  But given the problems of determining "identity"
of a disk (or volume or whatever) I'm sort of curious what working
well should really mean.

Have any of the SCSI people been looking much at SCSI hotplug on 2.5?
I attach "/etc/hotplug/scsi.agent" from one of my desktops; all it
does is make sure the right drivers are loaded, it doesn't have a
clue yet about whether/how/where to mount disks or do other stuff.

- Dave


[-- Attachment #2: scsi.agent --]
[-- Type: text/plain, Size: 1022 bytes --]

#!/bin/bash
#
# SCSI hotplug agent for 2.5 kernels 
#
#	ACTION=add
#	DEVPATH=devices/scsi0/0:0:0:0
#

cd /etc/hotplug
. hotplug.functions

case $ACTION in

add)
    # 2.5.50 kernel bug: this happens sometimes
    if [ ! -d /sys/$DEVPATH ]; then
	mesg "bogus sysfs DEVPATH=$DEVPATH"
	exit 1
    fi

    TYPE=$(cat /sys/$DEVPATH/type)
    case "$TYPE" in
    # 2.5.51 style attributes; <scsi/scsi.h> TYPE_* constants
    0)		TYPE=disk ; MODULE=sd_mod ;;
    # FIXME some tapes use 'osst' not 'st'
    1)		TYPE=tape ; MODULE=st ;;
    2)		TYPE=printer ;;
    3)		TYPE=processor ;;
    4)		TYPE=worm ; MODULE=sr_mod ;;
    5)		TYPE=cdrom ; MODULE=sr_mod ;;
    6)		TYPE=scanner ;;
    7)		TYPE=mod ; MODULE=sd_mod ;;
    8)		TYPE=changer ;;
    9)		TYPE=comm ;;
    14)		TYPE=enclosure ;;
    esac
    if [ "$MODULE" != "" ]; then
	mesg "$TYPE at $DEVPATH"
	modprobe $MODULE
    else
	mesg "how to add device type=$TYPE at $DEVPATH ??"
    fi
    ;;

*)
    debug_mesg SCSI $ACTION event not supported
    exit 1
    ;;

esac

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 20:40           ` David Brownell
@ 2003-01-16 20:48             ` Mike Anderson
  2003-01-16 23:43               ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-01-16 20:48 UTC (permalink / raw)
  To: David Brownell; +Cc: Greg KH, linux-usb-devel, Linux SCSI list

David Brownell [david-b@pacbell.net] wrote:
> Have any of the SCSI people been looking much at SCSI hotplug on 2.5?
> I attach "/etc/hotplug/scsi.agent" from one of my desktops; all it
> does is make sure the right drivers are loaded, it doesn't have a
> clue yet about whether/how/where to mount disks or do other stuff.

SCSI has added newer interfaces for host drivers to use that allow
single adds and removes. The api document is located in
/Documentation/scsi/scsi_mid_low_api.txt. I believe it is fairly
current.

I have been using the scsi_debug driver (which uses the newer interface)
to add and remove pseudo adapters, but have not checked the results
through user space hotplug.

-andmike
--
Michael Anderson
andmike@us.ibm.com

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 20:48             ` Mike Anderson
@ 2003-01-16 23:43               ` Oliver Neukum
  2003-01-17  8:50                 ` Mike Anderson
  0 siblings, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-16 23:43 UTC (permalink / raw)
  To: Mike Anderson, David Brownell; +Cc: Greg KH, linux-usb-devel, Linux SCSI list

Am Donnerstag, 16. Januar 2003 21:48 schrieb Mike Anderson:
> David Brownell [david-b@pacbell.net] wrote:
> > Have any of the SCSI people been looking much at SCSI hotplug on 2.5?
> > I attach "/etc/hotplug/scsi.agent" from one of my desktops; all it
> > does is make sure the right drivers are loaded, it doesn't have a
> > clue yet about whether/how/where to mount disks or do other stuff.
>
> SCSI has added newer interfaces for host drivers to use that allow
> single adds and removes. The api document is located in
> /Documentation/scsi/scsi_mid_low_api.txt. I believe it is fairly
> current.

This is good news in principle.
But what use is a function like scsi_remove_host() if it can fail?
If a device is gone, it is gone and all the complaining in the world
won't alter that.
Could you explain how a LLD can ensure that this function will always
succeed?

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 23:43               ` Oliver Neukum
@ 2003-01-17  8:50                 ` Mike Anderson
  2003-01-17 10:55                   ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-01-17  8:50 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

Oliver Neukum [oliver@neukum.name] wrote:
> Am Donnerstag, 16. Januar 2003 21:48 schrieb Mike Anderson:
> > David Brownell [david-b@pacbell.net] wrote:
> > > Have any of the SCSI people been looking much at SCSI hotplug on 2.5?
> > > I attach "/etc/hotplug/scsi.agent" from one of my desktops; all it
> > > does is make sure the right drivers are loaded, it doesn't have a
> > > clue yet about whether/how/where to mount disks or do other stuff.
> >
> > SCSI has added newer interfaces for host drivers to use that allow
> > single adds and removes. The api document is located in
> > /Documentation/scsi/scsi_mid_low_api.txt. I believe it is fairly
> > current.
> 
> This is good news in principle.
> But what use is a function like scsi_remove_host() if it can fail?
> If a device is gone, it is gone and all the complaining in the world
> won't alter that.
> Could you explain how a LLD can ensure that this function will always
> succeed?

The interface was focused on clean removes. There may be layers above
the SCSI subsystem using the device that may need user space
intervention to release. Currently the request_queue exported to the
block layer is tied to the scsi device and cannot dis-associated.

(NOTE: in looking at this in 2.5.58 it appears that there maybe some
checks missing in scsi_check_device_busy that would make
scsi_remove_host succeed more often than it should).

If the device disappears the host needs to return all IOs in flight with
an error, and  we need to ensure new ones do not start ( setting the
device offline and maybe host_self_blocked). This is a start on getting
the scsi_remove_host to succed.

-andmike
--
Michael Anderson
andmike@us.ibm.com

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17  8:50                 ` Mike Anderson
@ 2003-01-17 10:55                   ` Oliver Neukum
  2003-01-17 15:06                     ` Alan Stern
  2003-01-17 18:54                     ` Matthew Dharm
  0 siblings, 2 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-17 10:55 UTC (permalink / raw)
  To: Mike Anderson; +Cc: David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

Am Freitag, 17. Januar 2003 09:50 schrieb Mike Anderson:
> Oliver Neukum [oliver@neukum.name] wrote:
> > Am Donnerstag, 16. Januar 2003 21:48 schrieb Mike Anderson:
> > > David Brownell [david-b@pacbell.net] wrote:
> > > > Have any of the SCSI people been looking much at SCSI hotplug on 2.5?
> > > > I attach "/etc/hotplug/scsi.agent" from one of my desktops; all it
> > > > does is make sure the right drivers are loaded, it doesn't have a
> > > > clue yet about whether/how/where to mount disks or do other stuff.
> > >
> > > SCSI has added newer interfaces for host drivers to use that allow
> > > single adds and removes. The api document is located in
> > > /Documentation/scsi/scsi_mid_low_api.txt. I believe it is fairly
> > > current.
> >
> > This is good news in principle.
> > But what use is a function like scsi_remove_host() if it can fail?
> > If a device is gone, it is gone and all the complaining in the world
> > won't alter that.
> > Could you explain how a LLD can ensure that this function will always
> > succeed?
>
> The interface was focused on clean removes. There may be layers above
> the SCSI subsystem using the device that may need user space
> intervention to release. Currently the request_queue exported to the
> block layer is tied to the scsi device and cannot dis-associated.
>
> (NOTE: in looking at this in 2.5.58 it appears that there maybe some
> checks missing in scsi_check_device_busy that would make
> scsi_remove_host succeed more often than it should).

That is simply wrong. Reporting somebody having pulled a plug must
not fail. What are you supposed to do with an error here?

There must be a way for a LLD to report that reliably.
If the answer is, take that lock, call that function, error all pending
requests, release that lock and call that function, it's OK.

But it must work in all cases.

> If the device disappears the host needs to return all IOs in flight with
> an error, and  we need to ensure new ones do not start ( setting the
> device offline and maybe host_self_blocked). This is a start on getting
> the scsi_remove_host to succed.

Could you provide some details?

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17 10:55                   ` Oliver Neukum
@ 2003-01-17 15:06                     ` Alan Stern
  2003-01-17 18:54                     ` Matthew Dharm
  1 sibling, 0 replies; 106+ messages in thread
From: Alan Stern @ 2003-01-17 15:06 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Mike Anderson, David Brownell, Greg KH, linux-usb-devel,
	Linux SCSI list

On Fri, 17 Jan 2003, Oliver Neukum wrote:

> Am Freitag, 17. Januar 2003 09:50 schrieb Mike Anderson:
> >
> > If the device disappears the host needs to return all IOs in flight with
> > an error, and  we need to ensure new ones do not start ( setting the
> > device offline and maybe host_self_blocked). This is a start on getting
> > the scsi_remove_host to succed.
> 
> Could you provide some details?

Usb-storage does most of this already.  When the device is unplugged, 
all current transactions return an error and any new ones return a 
device-not-ready code.  But that doesn't solve the problem of removing the 
device's representation within SCSI and sysfs.

Alan Stern


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17 10:55                   ` Oliver Neukum
  2003-01-17 15:06                     ` Alan Stern
@ 2003-01-17 18:54                     ` Matthew Dharm
  2003-01-17 20:25                       ` Mike Anderson
                                         ` (2 more replies)
  1 sibling, 3 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-01-17 18:54 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Mike Anderson, David Brownell, Greg KH, linux-usb-devel,
	Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 1766 bytes --]

On Fri, Jan 17, 2003 at 11:55:36AM +0100, Oliver Neukum wrote:
> That is simply wrong. Reporting somebody having pulled a plug must
> not fail. What are you supposed to do with an error here?
> 
> There must be a way for a LLD to report that reliably.
> If the answer is, take that lock, call that function, error all pending
> requests, release that lock and call that function, it's OK.
> 
> But it must work in all cases.

I absolutely agree.  The device is gone.  I can't do anything about it.
If the SCSI layer decides it can't let go, what am I supposed to do about
it?

In a separate discussion with Mike, he mentioned that you can't
scsi_remove_device() unless there are no pending commands.

How the hell is an LLD supposed to assure that!?!?

The minute I error a command and call scsi_done(), I can get a new one.
Unless I lock out requests with scsi_block_requests(), but that comes with
major warnings about needing to get unblocked.

The way this should work is that the LLD calls scsi_remove_device(), and
that cuts off the flow of commands.  The LLD can promise to error-out any
pending commands in the device command queue.

That is, unless scsi_block_requests() and scsi_unblock_requests() are more
useful than the documentation suggests... block(), error all commands, 
unregister()... that would make some sense.  We could call
scsi_block_request() as soon as we know the unit is gone, and unregister()
as soon as the queue is empty.

Matt

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

A:  The most ironic oxymoron wins ...
DP: "Microsoft Works"
A:  Uh, okay, you win.
					-- A.J. & Dust Puppy
User Friendly, 1/18/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17 18:54                     ` Matthew Dharm
@ 2003-01-17 20:25                       ` Mike Anderson
  2003-01-17 22:07                         ` Oliver Neukum
  2003-01-17 20:26                       ` [linux-usb-devel] " Oliver Neukum
  2003-01-20 17:36                       ` Luben Tuikov
  2 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-01-17 20:25 UTC (permalink / raw)
  To: Oliver Neukum, David Brownell, Greg KH, linux-usb-devel,
	Linux SCSI list

Oliver and Alan I am trying to catch up on this thread so I did not
reply directly to your concerns, but I think they are covered below.

Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> On Fri, Jan 17, 2003 at 11:55:36AM +0100, Oliver Neukum wrote:
> > That is simply wrong. Reporting somebody having pulled a plug must
> > not fail. What are you supposed to do with an error here?
> > 
> > There must be a way for a LLD to report that reliably.
> > If the answer is, take that lock, call that function, error all pending
> > requests, release that lock and call that function, it's OK.
> > 
> > But it must work in all cases.
> 
> I absolutely agree.  The device is gone.  I can't do anything about it.
> If the SCSI layer decides it can't let go, what am I supposed to do about
> it?
> 
> In a separate discussion with Mike, he mentioned that you can't
> scsi_remove_device() unless there are no pending commands.
> 
> How the hell is an LLD supposed to assure that!?!?
> 

I believe that the scsi_remove_host function the way it is currently is
not the correct function. The SCSI needs to separate the device gone
from freeing. There maybe some unbounded cleanup as the request_queue
that is exported to the block layer is part of the scsi_device which is
a child of the virtual usb SCSI adapter. The only way to reduce the
unbounded time is possibly we reorganizing some sysfs tree object
layouts.

> The minute I error a command and call scsi_done(), I can get a new one.
> Unless I lock out requests with scsi_block_requests(), but that comes with
> major warnings about needing to get unblocked.
> 

Well in the case of the device really being gone does the LLD need to be
worried about being unblocked. I get the feeling from this thread that
this is probably the wrong interface.

> The way this should work is that the LLD calls scsi_remove_device(), and
> that cuts off the flow of commands.  The LLD can promise to error-out any
> pending commands in the device command queue.
> 
> That is, unless scsi_block_requests() and scsi_unblock_requests() are more
> useful than the documentation suggests... block(), error all commands, 
> unregister()... that would make some sense.  We could call
> scsi_block_request() as soon as we know the unit is gone, and unregister()
> as soon as the queue is empty.


We should really ensure that we have good separation between stopping device
IO, device gone, and release resources.

	- SCSI seems to have the flags to stop the IO, but instead of
	  scsi_block_requests we may want to export the setting of
	  device online. This can be done from sysfs now, but
	  not from the driver ( the driver does have a handle to the
	  device, but it would be better to have an interface in case we
	  need to do something addition operations).

	- Possibly add a scsi_remove_device that would always succeed and
	  a version of scsi_remove_host that calls scsi_remove_device for
	  all devices. Though with the recent change to SCSI remove host to
	  allow non sysfs device registration I do not believe we could
	  ensure devices would be cleaned up.

	- SCSI would need to support ref counting so that resources are
	  not removed to soon.


-andmike
--
Michael Anderson
andmike@us.ibm.com


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [PATCH] USB changes for 2.5.58
  2003-01-17 20:25                       ` Mike Anderson
@ 2003-01-17 22:07                         ` Oliver Neukum
  0 siblings, 0 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-17 22:07 UTC (permalink / raw)
  To: Mike Anderson, David Brownell, Greg KH, linux-usb-devel,
	Linux SCSI list


> > The way this should work is that the LLD calls scsi_remove_device(), and
> > that cuts off the flow of commands.  The LLD can promise to error-out any
> > pending commands in the device command queue.
> >
> > That is, unless scsi_block_requests() and scsi_unblock_requests() are
> > more useful than the documentation suggests... block(), error all
> > commands, unregister()... that would make some sense.  We could call
> > scsi_block_request() as soon as we know the unit is gone, and
> > unregister() as soon as the queue is empty.
>
> We should really ensure that we have good separation between stopping
> device IO, device gone, and release resources.

Very good.

> 	- SCSI seems to have the flags to stop the IO, but instead of
> 	  scsi_block_requests we may want to export the setting of
> 	  device online. This can be done from sysfs now, but

Yes, extremely good, we _need_ this.

> 	  not from the driver ( the driver does have a handle to the
> 	  device, but it would be better to have an interface in case we
> 	  need to do something addition operations).
>
> 	- Possibly add a scsi_remove_device that would always succeed and
> 	  a version of scsi_remove_host that calls scsi_remove_device for
> 	  all devices. Though with the recent change to SCSI remove host to
> 	  allow non sysfs device registration I do not believe we could
> 	  ensure devices would be cleaned up.

Meaning? If memory stays tied up, we have a beautiful DOS attack,
if disconnection can be faked by software.

	Regards
		Oliver



-------------------------------------------------------
This SF.NET email is sponsored by: Thawte.com - A 128-bit supercerts will
allow you to extend the highest allowed 128 bit encryption to all your
clients even if they use browsers that are limited to 40 bit encryption.
Get a guide here:http://ads.sourceforge.net/cgi-bin/redirect.pl?thaw0030en
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17 18:54                     ` Matthew Dharm
  2003-01-17 20:25                       ` Mike Anderson
@ 2003-01-17 20:26                       ` Oliver Neukum
  2003-01-17 20:49                         ` Mike Anderson
  2003-01-20 17:36                       ` Luben Tuikov
  2 siblings, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-17 20:26 UTC (permalink / raw)
  To: Matthew Dharm
  Cc: Mike Anderson, David Brownell, Greg KH, linux-usb-devel,
	Linux SCSI list


> In a separate discussion with Mike, he mentioned that you can't
> scsi_remove_device() unless there are no pending commands.
>
> How the hell is an LLD supposed to assure that!?!?
>
> The minute I error a command and call scsi_done(), I can get a new one.
> Unless I lock out requests with scsi_block_requests(), but that comes with
> major warnings about needing to get unblocked.

If I understand the scsi code correctly, doing that will result in a memory
leak at least. Perhaps exporting a function to declare a host's devices
offline might do the trick. But as yet I havn't found out where the scsi layer
actually checks that flag.

> The way this should work is that the LLD calls scsi_remove_device(), and
> that cuts off the flow of commands.  The LLD can promise to error-out any
> pending commands in the device command queue.
>
> That is, unless scsi_block_requests() and scsi_unblock_requests() are more
> useful than the documentation suggests... block(), error all commands,
> unregister()... that would make some sense.  We could call
> scsi_block_request() as soon as we know the unit is gone, and unregister()
> as soon as the queue is empty.

Sounds reasonable.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17 20:26                       ` [linux-usb-devel] " Oliver Neukum
@ 2003-01-17 20:49                         ` Mike Anderson
  0 siblings, 0 replies; 106+ messages in thread
From: Mike Anderson @ 2003-01-17 20:49 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Matthew Dharm, David Brownell, Greg KH, linux-usb-devel,
	Linux SCSI list

Oliver Neukum [oliver@neukum.name] wrote:
> 
> > In a separate discussion with Mike, he mentioned that you can't
> > scsi_remove_device() unless there are no pending commands.
> >
> > How the hell is an LLD supposed to assure that!?!?
> >
> > The minute I error a command and call scsi_done(), I can get a new one.
> > Unless I lock out requests with scsi_block_requests(), but that comes with
> > major warnings about needing to get unblocked.
> 
> If I understand the scsi code correctly, doing that will result in a memory
> leak at least. Perhaps exporting a function to declare a host's devices
> offline might do the trick. But as yet I havn't found out where the scsi layer
> actually checks that flag.

It is returned by scsi_block_when_processing_errors (upper level drivers
opens, ioctl, etc). It is checked in scsi_decide_disposition the
scsi_softirq / scsi_done side. It is checked in the command init of the
upper level drivers during scsi_prep_fn.

-andmike
--
Michael Anderson
andmike@us.ibm.com


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17 18:54                     ` Matthew Dharm
  2003-01-17 20:25                       ` Mike Anderson
  2003-01-17 20:26                       ` [linux-usb-devel] " Oliver Neukum
@ 2003-01-20 17:36                       ` Luben Tuikov
  2003-01-20 18:23                         ` Oliver Neukum
  2003-01-20 20:08                         ` David Brownell
  2 siblings, 2 replies; 106+ messages in thread
From: Luben Tuikov @ 2003-01-20 17:36 UTC (permalink / raw)
  To: Matthew Dharm
  Cc: Oliver Neukum, Mike Anderson, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

Matthew Dharm wrote:
> In a separate discussion with Mike, he mentioned that you can't
> scsi_remove_device() unless there are no pending commands.
> 
> How the hell is an LLD supposed to assure that!?!?

ABORT TASK/ABORT TASK SET.

For a year now I've been trying to get something of the sort
scsi_cancel_task/set().  It will send the aforementioned task
management functions (TMF) (depending on the abilities of the device)
to the device server (LLD).  After which the initiator should
NOT get a response to any already queued commands in the LLD.
LLDD should be smarter if they do their own queuing and snoop this
and act accordingly.

After sending such a TMF to the LLD, one can clean all ULP queues
(scsi, block, etc), knowing that there'd be no response to
a command (which is now gone), and then actually remove
the device.

In my own drivers/mini-scsi-core, I do something like this:
1. mark the device off (stop queuing anything to it, return error or
    whatever),
2. send the aforementioned TMF,
    2a) wait for current transfers to complete
3. cancel ULP queues.

Now the device is cleanly off, and one can remove it/restart it/whatever.

Note that this method is cleanly reversible (1. turn on,
2. LUN/device RESET (scsi layer), 3. start queuing (block layer)).

(Note as well that I make distinction between LLD and LLDD, where the last D
stands for ``driver'' in LLDD.)

> The way this should work is that the LLD calls scsi_remove_device(), and
> that cuts off the flow of commands.  The LLD can promise to error-out any
> pending commands in the device command queue.

I take it you mean that the transport will tell the LLDD that the device
is gone and it (LLDD) call the one above, SCSI Core to remove the device.

Hmm, more thinking needs to be done here, as shouldn't this be handled
by hotplugging? I.e. Targets do not *initiate* events.

The transport can notify that the device is gone, but an ULP entity will
call scsi_remove_device() not the other way around.

-- 
Luben

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 17:36                       ` Luben Tuikov
@ 2003-01-20 18:23                         ` Oliver Neukum
  2003-01-20 18:56                           ` Luben Tuikov
  2003-01-21  3:31                           ` Alan
  2003-01-20 20:08                         ` David Brownell
  1 sibling, 2 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-20 18:23 UTC (permalink / raw)
  To: Luben Tuikov, Matthew Dharm
  Cc: Mike Anderson, David Brownell, Greg KH, linux-usb-devel,
	Linux SCSI list

> I take it you mean that the transport will tell the LLDD that the device
> is gone and it (LLDD) call the one above, SCSI Core to remove the device.
>
> Hmm, more thinking needs to be done here, as shouldn't this be handled
> by hotplugging? I.e. Targets do not *initiate* events.
>
> The transport can notify that the device is gone, but an ULP entity will
> call scsi_remove_device() not the other way around.

NO!

This is an insanely complicated scheme.
We have no notification beforehand. User yanks out cable.
That's it. No preperation at all.

We as the writers of device drivers need a way to get rid of the device
as we are notified of the physical disconnect. It is not our job to maintain
devices in an undead state.

And a scheme that goes subsystem driver -> hotplugging -> script finding
corresponding devices -> script doing proc magic -> scsi layer notifying
low level driver is _not_ sensible. It triples the amount of complexity.

We need a simple scheme like
1. block further requests
2. kill old requests
3. remove device

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [PATCH] USB changes for 2.5.58
  2003-01-20 18:23                         ` Oliver Neukum
@ 2003-01-20 18:56                           ` Luben Tuikov
  2003-01-20 19:10                             ` [linux-usb-devel] " Oliver Neukum
  2003-01-20 19:50                             ` David Brownell
  2003-01-21  3:31                           ` Alan
  1 sibling, 2 replies; 106+ messages in thread
From: Luben Tuikov @ 2003-01-20 18:56 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Matthew Dharm, Mike Anderson, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

Oliver Neukum wrote:
>>I take it you mean that the transport will tell the LLDD that the device
>>is gone and it (LLDD) call the one above, SCSI Core to remove the device.
>>
>>Hmm, more thinking needs to be done here, as shouldn't this be handled
>>by hotplugging? I.e. Targets do not *initiate* events.
>>
>>The transport can notify that the device is gone, but an ULP entity will
>>call scsi_remove_device() not the other way around.
> 
> 
> NO!

We're probably talking about two different things.

> This is an insanely complicated scheme.
> We have no notification beforehand. User yanks out cable.
> That's it. No preperation at all.

So now you have two possibilities:
	a) the transport supports this event notification,
	b) the transport doesn't support this event notification.

Let me just elaborate a bit more here: since the transport/LLDD would
know that the device has just disappeared (a)), it *will* return error
in the due time when someone is trying to use it (and this is the
same error as if there had never been such a device).

But a removal of a device would probably have to start in top-down
approach, to free/release/etc resources/etc, rather than a bottom-up
approach (I just cannot see how this would work...)

In fact, this is the whole point of hotplugging, as there may be
other closely related things which would have to be done.

> We as the writers of device drivers need a way to get rid of the device
> as we are notified of the physical disconnect.

Yes, and as I explained earlier: you *may* get notified by the transport.

 > It is not our job to maintain devices in an undead state.

Yes, Linus has said this here before and it's pointless for you to
repeat it here.  Furthermore, nothing I've said suggests that.

Everyone agrees with this.

> And a scheme that goes subsystem driver -> hotplugging -> script finding
> corresponding devices -> script doing proc magic -> scsi layer notifying
> low level driver is _not_ sensible. It triples the amount of complexity.
> 
> We need a simple scheme like
> 1. block further requests
> 2. kill old requests
> 3. remove device

But there's nothing non-trivial about this scheme -- i.e. it's no
brainer -- everyone knows that *those* are the minimum set of steps.

Q: who initiates step 1?

If it's the user, then we're wasting time discussing trivialities
here (or maybe we're showing that we're working :-)) ).
But if it's not the user, then...

In general, I wasn't really discussing hotplugging, I was basically
hinting that SCSI Core could use a few more functionalities, and that
some things need to go away.

-- 
Luben





-------------------------------------------------------
This SF.NET email is sponsored by: FREE  SSL Guide from Thawte
are you planning your Web Server Security? Click here to get a FREE
Thawte SSL guide and find the answers to all your  SSL security issues.
http://ads.sourceforge.net/cgi-bin/redirect.pl?thaw0026en
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 18:56                           ` Luben Tuikov
@ 2003-01-20 19:10                             ` Oliver Neukum
  2003-01-20 19:50                             ` David Brownell
  1 sibling, 0 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-20 19:10 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Matthew Dharm, Mike Anderson, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list


> > We as the writers of device drivers need a way to get rid of the device
> > as we are notified of the physical disconnect.
>
> Yes, and as I explained earlier: you *may* get notified by the transport.

That exactly is the point. There must be no maybe. Gone is gone.
Failure is not an option here.
Only LLDD knows reliably whether a device is gone and there can
be no second guessing by higher levels. Consequently, the LLDD
reports it and the higher layers delete the device as gently as possible,
but delete it in any case.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 18:56                           ` Luben Tuikov
  2003-01-20 19:10                             ` [linux-usb-devel] " Oliver Neukum
@ 2003-01-20 19:50                             ` David Brownell
  1 sibling, 0 replies; 106+ messages in thread
From: David Brownell @ 2003-01-20 19:50 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Oliver Neukum, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Luben Tuikov wrote:

> But a removal of a device would probably have to start in top-down
> approach, to free/release/etc resources/etc, rather than a bottom-up
> approach (I just cannot see how this would work...)

Well, the driver model core covers key parts of that.  Or it was
at least intended to ... in fact, today I think the bus drivers
(like USB and PCI) need to own the top-down logic for cases like
resume and hub/bridge/adapter (dis)connect.  (And their bottom-up
analogues for suspension.)

> In general, I wasn't really discussing hotplugging, I was basically
> hinting that SCSI Core could use a few more functionalities, and that
> some things need to go away.

There does seem to be some consensus on that point, which is
a good place to start!

- Dave






^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 18:23                         ` Oliver Neukum
  2003-01-20 18:56                           ` Luben Tuikov
@ 2003-01-21  3:31                           ` Alan
  2003-01-21  7:17                             ` Oliver Neukum
  2003-01-21 13:30                             ` James Bottomley
  1 sibling, 2 replies; 106+ messages in thread
From: Alan @ 2003-01-21  3:31 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, David Brownell,
	Greg KH, linux-usb-devel, Linux SCSI list

On Mon, 2003-01-20 at 18:23, Oliver Neukum wrote:
> > The transport can notify that the device is gone, but an ULP entity will
> > call scsi_remove_device() not the other way around.
> 
> NO!
> 
> This is an insanely complicated scheme.
> We have no notification beforehand. User yanks out cable.
> That's it. No preperation at all.
> 
> We as the writers of device drivers need a way to get rid of the device
> as we are notified of the physical disconnect. It is not our job to maintain
> devices in an undead state.

If you think about it rationally there isnt a lot that can be done higher
up. At the point the hardware vanishes there may be other threads of execution
already in your driver, so undead state is a reality you have to live with, at
least briefly.

Providing you refcount objects and defer freeing of resources its not normally
too terrible 


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [PATCH] USB changes for 2.5.58
  2003-01-21  3:31                           ` Alan
@ 2003-01-21  7:17                             ` Oliver Neukum
  2003-01-21 11:57                               ` [linux-usb-devel] " Douglas Gilbert
  2003-01-21 13:30                             ` James Bottomley
  1 sibling, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-21  7:17 UTC (permalink / raw)
  To: Alan
  Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, David Brownell,
	Greg KH, linux-usb-devel, Linux SCSI list

Am Dienstag, 21. Januar 2003 04:31 schrieb Alan:
> On Mon, 2003-01-20 at 18:23, Oliver Neukum wrote:
> > > The transport can notify that the device is gone, but an ULP entity
> > > will call scsi_remove_device() not the other way around.
> >
> > NO!
> >
> > This is an insanely complicated scheme.
> > We have no notification beforehand. User yanks out cable.
> > That's it. No preperation at all.
> >
> > We as the writers of device drivers need a way to get rid of the device
> > as we are notified of the physical disconnect. It is not our job to
> > maintain devices in an undead state.
>
> If you think about it rationally there isnt a lot that can be done higher
> up. At the point the hardware vanishes there may be other threads of
> execution already in your driver, so undead state is a reality you have to
> live with, at least briefly.

No problem with that. I have a problem with notifying the SCSI layer
and then waiting for an unlimited time until maybe the SCSI layer decides
to inform me of a success. You see, disconnection has to work.
Having to wait for an unlimited time is a kind of failure.

I simply don't trust the SCSI layer. I've had to much trouble with it
already.

	Regards
		Oliver



-------------------------------------------------------
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-21  7:17                             ` Oliver Neukum
@ 2003-01-21 11:57                               ` Douglas Gilbert
  2003-01-21 13:48                                 ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: Douglas Gilbert @ 2003-01-21 11:57 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Alan, Luben Tuikov, Matthew Dharm, Mike Anderson, David Brownell,
	Greg KH, linux-usb-devel, Linux SCSI list

Oliver Neukum wrote:
 > <snip/>
 >
> I simply don't trust the SCSI layer. I've had to much trouble with it
> already.

Hopefully we have some better building blocks and
clearer code in 2.5 to look at this problem in the SCSI
subsystem again.

The first thing a LLDD should do when it _knows_ the device
is gone is set scsi_device::online=0 ** which should stop all
new commands being queued. Now if scsi_device::access_count
is zero then we have no problems *** and most of the code
we need is in place in latter half of scsi_remove_single_device().

The hard case is when scsi_device::access_count>0 which means
open()s or mounts are active on that device. So sd, sr, st, osst
and/or sg know about a file descriptor (or the block equivalent)
that is associated with that "departed" scsi_device instance. I
have code in sg in lk 2.4 to partially handle the case when
detach is called on a device for which sg holds an open fd.
Sg can handle this because it shadows scsi_device and
scsi_cmnd instances. The next time an app tries to access
that fd it gets a ENODEV (even sends out a SIGIO/POLL_HUP for
advanced apps). I suspect life would not be so simple for sd
and sr due to their close binding with the block subsystem.
Another approach to this problem is to keep the scsi_device
instances for departed devices around until the access_count
drops to zero. One silly idea I had was to change the seldom
used channel number to 1024 (or the next free number above that)
to maintain the uniqueness of the host/channel/id/lun tuple ****
and keep the original tuple available for the re-appearance
of the departed device.

Compounding the hard case is when commands are queued. Can
these simply be ENODEV-ed back to the apps that own them?
If so, that may help the access_count drop to zero facilitating
scsi_device removal.

Question: do we need to worry about hot unplugging of hosts?

** Since 'online' already has a usage (i.e. error recovery
couldn't resurrect this device) perhaps something stronger
like 'departed' is needed as well (or some sysfs mechanism).
BTW sg sidesteps 'online' when a file descriptor is opened
non-blocking.

*** As Oliver has pointed out to me before, there are still
opportunities for races when access_count is zero and an
open()/mount is about to happen.

**** We need a new, enhanced version of the
SCSI_IOCTL_GET_IDLUN ioctl which as it stands can only
convey 8 bit quantities for host, channel, target and lun;
especially if we go to 64 bit luns.

... just some random thoughts ... fire away

Doug Gilbert

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-21 11:57                               ` [linux-usb-devel] " Douglas Gilbert
@ 2003-01-21 13:48                                 ` Oliver Neukum
  2003-01-21 18:22                                   ` Luben Tuikov
  0 siblings, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-21 13:48 UTC (permalink / raw)
  To: dougg
  Cc: Alan, Luben Tuikov, Matthew Dharm, Mike Anderson, David Brownell,
	Greg KH, linux-usb-devel, Linux SCSI list

Am Dienstag, 21. Januar 2003 12:57 schrieb Douglas Gilbert:
> Oliver Neukum wrote:
>  > <snip/>
> >
> > I simply don't trust the SCSI layer. I've had to much trouble with it
> > already.
>
> Hopefully we have some better building blocks and
> clearer code in 2.5 to look at this problem in the SCSI
> subsystem again.

I surely hope so. Let's discuss it.

> The first thing a LLDD should do when it _knows_ the device
> is gone is set scsi_device::online=0 ** which should stop all

The SCSI should export functions to do that and to do it to all
a host's devices.

> new commands being queued. Now if scsi_device::access_count
> is zero then we have no problems *** and most of the code
> we need is in place in latter half of scsi_remove_single_device().

Yes. And scsi_device_get() should check for a device having been
unplugged.

> The hard case is when scsi_device::access_count>0 which means
> open()s or mounts are active on that device. So sd, sr, st, osst
> and/or sg know about a file descriptor (or the block equivalent)

What is the block equivalent?

> that is associated with that "departed" scsi_device instance. I
> have code in sg in lk 2.4 to partially handle the case when
> detach is called on a device for which sg holds an open fd.
> Sg can handle this because it shadows scsi_device and
> scsi_cmnd instances. The next time an app tries to access
> that fd it gets a ENODEV (even sends out a SIGIO/POLL_HUP for
> advanced apps). I suspect life would not be so simple for sd
> and sr due to their close binding with the block subsystem.
> Another approach to this problem is to keep the scsi_device
> instances for departed devices around until the access_count
> drops to zero. One silly idea I had was to change the seldom

That looks like a workable approach.

> used channel number to 1024 (or the next free number above that)
> to maintain the uniqueness of the host/channel/id/lun tuple ****
> and keep the original tuple available for the re-appearance
> of the departed device.
>
> Compounding the hard case is when commands are queued. Can
> these simply be ENODEV-ed back to the apps that own them?
> If so, that may help the access_count drop to zero facilitating
> scsi_device removal.

The LLDD can certainly return an error for the commands already queued.

> Question: do we need to worry about hot unplugging of hosts?

Yes, definitely yes. A USB storage device is a virtual host, since
scsi ids would otherwise collide. Besides, PCMCIA SCSI host are
not exactly brandnew either.

> ** Since 'online' already has a usage (i.e. error recovery
> couldn't resurrect this device) perhaps something stronger
> like 'departed' is needed as well (or some sysfs mechanism).
> BTW sg sidesteps 'online' when a file descriptor is opened
> non-blocking.

Good idea.

So a disconnection would look like this:

scsi_set_offline_host(...);
synchronize_kernel();
error_queied_commands(...);
scsi_remove_host();

	Regards
		Oliver



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-21 13:48                                 ` Oliver Neukum
@ 2003-01-21 18:22                                   ` Luben Tuikov
  0 siblings, 0 replies; 106+ messages in thread
From: Luben Tuikov @ 2003-01-21 18:22 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: dougg, Alan, Matthew Dharm, Mike Anderson, David Brownell,
	Greg KH, linux-usb-devel, Linux SCSI list

Oliver Neukum wrote:
 >
> So a disconnection would look like this:
> 
1.
> scsi_set_offline_host(...);
2.
> synchronize_kernel();
3.
> error_queied_commands(...);
4.
> scsi_remove_host();

Not quite.  You want to do 3 before 2, to get 2 going as soon
as possible.  Futhermore, 3 is partly in LLDD.  1 would
take care of 3 in SCSI Core.  See my previous (by date) post.

-- 
Luben




^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-21  3:31                           ` Alan
  2003-01-21  7:17                             ` Oliver Neukum
@ 2003-01-21 13:30                             ` James Bottomley
  1 sibling, 0 replies; 106+ messages in thread
From: James Bottomley @ 2003-01-21 13:30 UTC (permalink / raw)
  To: Alan
  Cc: Oliver Neukum, Luben Tuikov, Matthew Dharm, Mike Anderson,
	David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

alan@lxorguk.ukuu.org.uk said:
> Providing you refcount objects and defer freeing of resources its not
> normally too terrible  

We already have a struct device, which is a ref counted object precisely for 
this purpose, embedded inside Scsi_Device.

One of the issues doing this is fast reattachment: the device goes away then 
comes back before we've cleared the outstanding command queue.

In the latter case, I think we could use some of the work Luben Tuikov has 
been doing to make the cmnd structures tie more closely to the device:  as 
long as we remove the device from user visibility as soon as it is removed, we 
can keep the Scsi_Device object around until the ref count falls to zero, and 
even create a new one while this is going on.  Commands attached to the old 
device still error out.

It would still be nice if the trigger for Scsi_device removal came from a USB 
hotplug event at user level, but I'm not too bothered about that.

James

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [PATCH] USB changes for 2.5.58
  2003-01-20 17:36                       ` Luben Tuikov
  2003-01-20 18:23                         ` Oliver Neukum
@ 2003-01-20 20:08                         ` David Brownell
  2003-01-20 20:48                           ` [linux-usb-devel] " Oliver Neukum
  2003-01-20 22:16                           ` Luben Tuikov
  1 sibling, 2 replies; 106+ messages in thread
From: David Brownell @ 2003-01-20 20:08 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Matthew Dharm, Oliver Neukum, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Luben Tuikov wrote:

>> The way this should work is that the LLD calls scsi_remove_device(), and
>> that cuts off the flow of commands.  The LLD can promise to error-out any
>> pending commands in the device command queue.
> 
> 
> I take it you mean that the transport will tell the LLDD that the device
> is gone and it (LLDD) call the one above, SCSI Core to remove the device.
> 
> Hmm, more thinking needs to be done here, as shouldn't this be handled
> by hotplugging? I.e. Targets do not *initiate* events.

Not exactly, but the bus driver ("transport"?) certainly does initiate
reports like "here's a new device on the bus" or "that device is gone".
That's when hotplugging kicks in (both in-kernel and in-userland).

And the only way to access a device ("target") on the bus is to give a
request to that bus driver.  If, when servicing that request, the bus
driver notices the device is gone ... that can act a lot like a device
initiating a "device gone" event would look.

> The transport can notify that the device is gone, but an ULP entity will
> call scsi_remove_device() not the other way around.

That's how USB works today:  khubd shuts things down.  Device drivers
get disconnect() callbacks, just as when their modules are removed.

EXCEPT that "khubd" is part of usbcore (roughly analagous to parts
of the scsi mid-layer) ... so the drivers acting as host side proxies
for the target hardware ("usb device") are purely reactive.  Their
only roles in hotplug scenarios are to bind to devices (when a new
one appears, using probe callbacks) or unbind from them (when one
goes away, using disconnect callbacks).

Those disconnect() callbacks have a few key responsibilities, very
much including shutting down the entire higher level I/O queue to
that device.  I think you're saying that SCSI drivers don't have
such a responsibility (unlike USB or PCI) ... if so, that would
seem to be worth changing.

- Dave

-------------------------------------------------------
This SF.NET email is sponsored by: FREE  SSL Guide from Thawte
are you planning your Web Server Security? Click here to get a FREE
Thawte SSL guide and find the answers to all your  SSL security issues.
http://ads.sourceforge.net/cgi-bin/redirect.pl?thaw0026en
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 20:08                         ` David Brownell
@ 2003-01-20 20:48                           ` Oliver Neukum
  2003-01-20 21:24                             ` David Brownell
  2003-01-20 22:16                           ` Luben Tuikov
  1 sibling, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-20 20:48 UTC (permalink / raw)
  To: David Brownell, Luben Tuikov
  Cc: Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
	Linux SCSI list

Am Montag, 20. Januar 2003 21:08 schrieb David Brownell:
> Luben Tuikov wrote:
> >> The way this should work is that the LLD calls scsi_remove_device(), and
> >> that cuts off the flow of commands.  The LLD can promise to error-out
> >> any pending commands in the device command queue.
> >
> > I take it you mean that the transport will tell the LLDD that the device
> > is gone and it (LLDD) call the one above, SCSI Core to remove the device.
> >
> > Hmm, more thinking needs to be done here, as shouldn't this be handled
> > by hotplugging? I.e. Targets do not *initiate* events.
>
> Not exactly, but the bus driver ("transport"?) certainly does initiate
> reports like "here's a new device on the bus" or "that device is gone".
> That's when hotplugging kicks in (both in-kernel and in-userland).
>
> And the only way to access a device ("target") on the bus is to give a
> request to that bus driver.  If, when servicing that request, the bus
> driver notices the device is gone ... that can act a lot like a device
> initiating a "device gone" event would look.

Correct. As a LLDD is the lowest layer, these are equivalent thing.
Only a LLDD can positively detect a device or a bus going away.

> > The transport can notify that the device is gone, but an ULP entity will
> > call scsi_remove_device() not the other way around.
>
> That's how USB works today:  khubd shuts things down.  Device drivers
> get disconnect() callbacks, just as when their modules are removed.
>
> EXCEPT that "khubd" is part of usbcore (roughly analagous to parts
> of the scsi mid-layer) ... so the drivers acting as host side proxies
> for the target hardware ("usb device") are purely reactive.  Their
> only roles in hotplug scenarios are to bind to devices (when a new
> one appears, using probe callbacks) or unbind from them (when one
> goes away, using disconnect callbacks).

That model cannot be applied to SCSI as it is much more diverse
in the number of bus types it supports.
USB can do it, because it knows about hubs. SCSI cannot,
as there are no hubs in SCSI.

> Those disconnect() callbacks have a few key responsibilities, very
> much including shutting down the entire higher level I/O queue to
> that device.  I think you're saying that SCSI drivers don't have
> such a responsibility (unlike USB or PCI) ... if so, that would
> seem to be worth changing.

If the scsi layer cannot on its own detect that a device or a bus is gone,
there'll be no sense in having a callback. It's just a complication.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [PATCH] USB changes for 2.5.58
  2003-01-20 20:48                           ` [linux-usb-devel] " Oliver Neukum
@ 2003-01-20 21:24                             ` David Brownell
  2003-01-20 21:51                               ` [linux-usb-devel] " Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: David Brownell @ 2003-01-20 21:24 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Oliver Neukum wrote:

> That model cannot be applied to SCSI as it is much more diverse
> in the number of bus types it supports.
> USB can do it, because it knows about hubs. SCSI cannot,
> as there are no hubs in SCSI.

Hubs are irrelevant here, the key functionality is noticing
hardware addition/disconnect.  Parts of it can be done in bus
adapter code, parts of it can't.  SCSI probes LUNS in much
the same way khubd probes hub ports, and as I recall most of
that logic isn't any more specific to the adapter than virtual
root hub code is for USB.


>>Those disconnect() callbacks have a few key responsibilities, very
>>much including shutting down the entire higher level I/O queue to
>>that device.  I think you're saying that SCSI drivers don't have
>>such a responsibility (unlike USB or PCI) ... if so, that would
>>seem to be worth changing.
> 
> 
> If the scsi layer cannot on its own detect that a device or a bus is gone,
> there'll be no sense in having a callback. It's just a complication.

Erm ... which of the three SCSI layers are you talking about?  I was
talking about the highest level, which is precisely the layer I think
has been identified as already needing to know when to shut down the
I/O queues (sd_mod and friends).

- Dave



-------------------------------------------------------
This SF.NET email is sponsored by: FREE  SSL Guide from Thawte
are you planning your Web Server Security? Click here to get a FREE
Thawte SSL guide and find the answers to all your  SSL security issues.
http://ads.sourceforge.net/cgi-bin/redirect.pl?thaw0026en
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 21:24                             ` David Brownell
@ 2003-01-20 21:51                               ` Oliver Neukum
  2003-01-20 22:26                                 ` David Brownell
  0 siblings, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-20 21:51 UTC (permalink / raw)
  To: David Brownell
  Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Am Montag, 20. Januar 2003 22:24 schrieb David Brownell:
> Oliver Neukum wrote:
> > That model cannot be applied to SCSI as it is much more diverse
> > in the number of bus types it supports.
> > USB can do it, because it knows about hubs. SCSI cannot,
> > as there are no hubs in SCSI.
>
> Hubs are irrelevant here, the key functionality is noticing
> hardware addition/disconnect.  Parts of it can be done in bus
> adapter code, parts of it can't.  SCSI probes LUNS in much
> the same way khubd probes hub ports, and as I recall most of
> that logic isn't any more specific to the adapter than virtual
> root hub code is for USB.

I should be more specific.
The SCSI is different for several reasons:
- we are talking about bus as well as device detection
- there's no common way to probe for devices
  (the probing of LUNs works only on conventional busses)
- many SCSI devices are not really SCSI devices. They just use
  the command set

SCSI hotplug detection doesn't work for the same reason that USB
can handle only detection of devices by itself. Busses on the other
hand are not handled by USB itself.

> >>Those disconnect() callbacks have a few key responsibilities, very
> >>much including shutting down the entire higher level I/O queue to
> >>that device.  I think you're saying that SCSI drivers don't have
> >>such a responsibility (unlike USB or PCI) ... if so, that would
> >>seem to be worth changing.
> >
> > If the scsi layer cannot on its own detect that a device or a bus is
> > gone, there'll be no sense in having a callback. It's just a
> > complication.
>
> Erm ... which of the three SCSI layers are you talking about?  I was
> talking about the highest level, which is precisely the layer I think
> has been identified as already needing to know when to shut down the
> I/O queues (sd_mod and friends).

In SCSI view device and bus disconnection is recognised by the lowest
level. As it knows nothing about the high layers, it notifies the midlayer
which in turn notifies the high level drivers. What should a callback do?
The low level driver cannot do more than notify.
I don't see what the midlayer could do with a callback, but I defer judgement
here to the SCSI people, but definitely the LLDD has no use for a callback.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 21:51                               ` [linux-usb-devel] " Oliver Neukum
@ 2003-01-20 22:26                                 ` David Brownell
  2003-01-20 23:00                                   ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: David Brownell @ 2003-01-20 22:26 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Oliver Neukum wrote:
> I should be more specific.

Yes...

> The SCSI is different for several reasons:
> - we are talking about bus as well as device detection
> - there's no common way to probe for devices
>   (the probing of LUNs works only on conventional busses)

So the SCSI stack needs to support more than one model for
device/bus detection.  This can't be news.  And some of them
have to handle "conventional" busses, like USB and PCI; maybe
even handle booting off them...

> - many SCSI devices are not really SCSI devices. They just use
>   the command set
> 
> SCSI hotplug detection doesn't work for the same reason that USB
> can handle only detection of devices by itself. Busses on the other
> hand are not handled by USB itself.

Just how is it that USB doesn't handle USB?  "B" == "Bus" ... :)

If you mean that HCs hook up to a different bus (often PCI), with its
own hotplug support, that doesn't seem so different from SCSI HBAs
hooking up to such busses (often PCI) and cascading the same hotplug
support...

>>Erm ... which of the three SCSI layers are you talking about?  I was
>>talking about the highest level,..
> 
> 
> In SCSI view device and bus disconnection is recognised by the lowest
> level. As it knows nothing about the high layers, it notifies the midlayer

So you were talking past what I said about notifying that highest level,
not disagreeing with it.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 22:26                                 ` David Brownell
@ 2003-01-20 23:00                                   ` Oliver Neukum
  2003-01-21  0:44                                     ` David Brownell
  0 siblings, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-20 23:00 UTC (permalink / raw)
  To: David Brownell
  Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list


> So the SCSI stack needs to support more than one model for
> device/bus detection.  This can't be news.  And some of them
> have to handle "conventional" busses, like USB and PCI; maybe
> even handle booting off them...

Until very recently it was news. And they still haven't fully
comprehended the implications.

> If you mean that HCs hook up to a different bus (often PCI), with its
> own hotplug support, that doesn't seem so different from SCSI HBAs
> hooking up to such busses (often PCI) and cascading the same hotplug
> support...

Right. Only that in SCSI it's that way for devices as well, not just busses.
Therefore removal detection and notification is a strict bottom to top
process.

> >>Erm ... which of the three SCSI layers are you talking about?  I was
> >>talking about the highest level,..
> >
> > In SCSI view device and bus disconnection is recognised by the lowest
> > level. As it knows nothing about the high layers, it notifies the
> > midlayer
>
> So you were talking past what I said about notifying that highest level,
> not disagreeing with it.

I was trying to make the point that callbacks have no place in that process.
It must go bottom to top and that's it. And there must be no error conditions
on the way. Refusing to take notice of a device removal is just not an option.
This is exactly what the current SCSI idea of an API to do bus removal does.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 23:00                                   ` Oliver Neukum
@ 2003-01-21  0:44                                     ` David Brownell
  2003-01-21  0:50                                       ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: David Brownell @ 2003-01-21  0:44 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Oliver Neukum wrote:

>>So you were talking past what I said about notifying that highest level,
>>not disagreeing with it.
> 
> 
> I was trying to make the point that callbacks have no place in that process.

If so, you didn't persuade me ...

> It must go bottom to top and that's it.

... because those disconnect() callbacks are exactly how USB and PCI deliver
that notification to the "top" level, and you've already agreed that SCSI
needs to accomodate those models.  So clearly they have at least that much
of a place.

> 	 Refusing to take notice of a device removal is just not an option.
> This is exactly what the current SCSI idea of an API to do bus removal does.

I perceive violent agreement that a change is needed in that area.

But the next step there would seem to be a patch to the SCSI APIs,
unless I mis-understood what Matt was saying about the issue he ran
into when making usb-storage use the enumeration facilities in the
current SCSI mid/low layers.

- Dave

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-21  0:44                                     ` David Brownell
@ 2003-01-21  0:50                                       ` Oliver Neukum
  2003-01-21 18:16                                         ` Luben Tuikov
  0 siblings, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-21  0:50 UTC (permalink / raw)
  To: David Brownell
  Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Am Dienstag, 21. Januar 2003 01:44 schrieb David Brownell:
> Oliver Neukum wrote:
> >>So you were talking past what I said about notifying that highest level,
> >>not disagreeing with it.
> >
> > I was trying to make the point that callbacks have no place in that
> > process.
>
> If so, you didn't persuade me ...
>
> > It must go bottom to top and that's it.
>
> ... because those disconnect() callbacks are exactly how USB and PCI
> deliver that notification to the "top" level, and you've already agreed
> that SCSI needs to accomodate those models.  So clearly they have at least
> that much of a place.

Disconnect is not really a callback. There's a distinct lack of a back movement here.
khubd -> usbcore -> disconnect() in driver -> [layer on top]

The proposed API in SCSI looks like:
<bus system> -> LLD -> midlayer -> top layer -> midlayer -> LLD with destroy_slave()
and that's not OK.

> > 	 Refusing to take notice of a device removal is just not an option.
> > This is exactly what the current SCSI idea of an API to do bus removal
> > does.
>
> I perceive violent agreement that a change is needed in that area.
>
> But the next step there would seem to be a patch to the SCSI APIs,
> unless I mis-understood what Matt was saying about the issue he ran
> into when making usb-storage use the enumeration facilities in the
> current SCSI mid/low layers.

I have some horrible notions when I see what APIs grace the SCSI layer.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-21  0:50                                       ` Oliver Neukum
@ 2003-01-21 18:16                                         ` Luben Tuikov
  2003-01-21 19:00                                           ` Oliver Neukum
  2003-01-22 21:30                                           ` [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 David Brownell
  0 siblings, 2 replies; 106+ messages in thread
From: Luben Tuikov @ 2003-01-21 18:16 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: David Brownell, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Oliver Neukum wrote:
> 
> Disconnect is not really a callback. There's a distinct lack of a back movement here.
> khubd -> usbcore -> disconnect() in driver -> [layer on top]
> 
> The proposed API in SCSI looks like:
> <bus system> -> LLD -> midlayer -> top layer -> midlayer -> LLD with destroy_slave()
> and that's not OK.

No, not quite.

When the Low Level Device Driver (LLDD), being the transport portal,
notices that the device is going away or has gone away from the
``fabric'' (wlg), it will fire a device-gone event with the kernel.
*Not* necessarily with SCSI Core, in fact I'd rather it didn't,
but with a well defined kernel entry for device-gone events.

At the same time the LLDD will start returning TARGET gone, or
whatever is appropriate to newly queued commands, and error out
all internally queued commands (if it does it's own queuing).
(I've seen this work nicely on mount and read/write(2) and fsck.)

I.e. the ``synchronization'' has started already by the LLDD erroring
out commands, new and queued.

All the while the kernel has started higher level cleaning up,
decrementing ref counts, etc, stuff which may not be so easy to be
cleaned up just by LLDD returning TARGET error.  Even though,
good design dictates that complete cleaning up should happen just
by the LLDD returning TARGET error (e.g. on mount), we *have* to allow
for this immediate high level entry point (as I mentioned above) notification,
which will be kind of ``meeting place'' for events like this.

Depending on what needs to be done at those ``higher'' levels, the
event will eventually bubble down to the SCSI Core with something like
scsi_remove_device() which will do slave_destroy() in the driver.

The point is that at that point in time, it will be *safe* to do
scsi_remove_device() as all ULP have alreay been notified, and they've
relinquished their use of the LLD (Low Level Device), thus the safety.

But there's no such thing as ``waiting around indefinitely'' or
``blocking wait'' as you've suggested in some of your emails.

Even if this UL entry point doesn't do anything, ref counts should
go to zero, after all users error out on this device, at which point
the user can remove the device from *the system* by hand/old method
through proc or whatever finalizes for 2.6.

Those are more or less my thoughts on the subject.

-- 
Luben

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [PATCH] USB changes for 2.5.58
  2003-01-21 18:16                                         ` Luben Tuikov
@ 2003-01-21 19:00                                           ` Oliver Neukum
  2003-01-21 20:02                                             ` [linux-usb-devel] " Luben Tuikov
  2003-01-22 21:30                                           ` [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 David Brownell
  1 sibling, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-21 19:00 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: David Brownell, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Am Dienstag, 21. Januar 2003 19:16 schrieb Luben Tuikov:
> Oliver Neukum wrote:
> > Disconnect is not really a callback. There's a distinct lack of a back
> > movement here. khubd -> usbcore -> disconnect() in driver -> [layer on
> > top]
> >
> > The proposed API in SCSI looks like:
> > <bus system> -> LLD -> midlayer -> top layer -> midlayer -> LLD with
> > destroy_slave() and that's not OK.
>
> No, not quite.
>
> When the Low Level Device Driver (LLDD), being the transport portal,
> notices that the device is going away or has gone away from the
> ``fabric'' (wlg), it will fire a device-gone event with the kernel.
> *Not* necessarily with SCSI Core, in fact I'd rather it didn't,
> but with a well defined kernel entry for device-gone events.

Well, we are in feature freeze. I see no alternative but to notify
the mid layer. Who else but the mid layer knows what a physical device
is logically associated with?

> At the same time the LLDD will start returning TARGET gone, or
> whatever is appropriate to newly queued commands, and error out
> all internally queued commands (if it does it's own queuing).
> (I've seen this work nicely on mount and read/write(2) and fsck.)

Right.

> I.e. the ``synchronization'' has started already by the LLDD erroring
> out commands, new and queued.
>
> All the while the kernel has started higher level cleaning up,
> decrementing ref counts, etc, stuff which may not be so easy to be
> cleaned up just by LLDD returning TARGET error.  Even though,

You cannot really make anything depend on errors returned, because
there simply may not be any commands queued. You can make it a
requirement for an LLDD to return all commands in flight with an error,
but you can do little with these errors. Basically you have to treat them
like uncorrectable errors, except maybe for the error code returned to
user space. But the processing of the disconnect itself should be triggered
by the LLDD's notification, because it's the only indication of an unplug
event you are sure to get.

> good design dictates that complete cleaning up should happen just
> by the LLDD returning TARGET error (e.g. on mount), we *have* to allow
> for this immediate high level entry point (as I mentioned above)
> notification, which will be kind of ``meeting place'' for events like this.

That I don't understand. It would seem to me to be cleanest to have just
one path to process a disconnect event.

> Depending on what needs to be done at those ``higher'' levels, the
> event will eventually bubble down to the SCSI Core with something like
> scsi_remove_device() which will do slave_destroy() in the driver.
>
> The point is that at that point in time, it will be *safe* to do
> scsi_remove_device() as all ULP have alreay been notified, and they've
> relinquished their use of the LLD (Low Level Device), thus the safety.

But there can be no users of the LLDD at this point. There can of
course be references to devices and hosts, but not really uses.
After we have done a notification of the event the first things to do
are to make further opening of the device fail and make sure no more
commands are sent to the device. Likewise all queued commands have
returned with an error. So at this point it's impossible to use an unplugged
device.

> But there's no such thing as ``waiting around indefinitely'' or
> ``blocking wait'' as you've suggested in some of your emails.
>
> Even if this UL entry point doesn't do anything, ref counts should
> go to zero, after all users error out on this device, at which point
> the user can remove the device from *the system* by hand/old method
> through proc or whatever finalizes for 2.6.

You cannot be sure that reference counts will go to zero ever.
You can be sure that they won't increase as you can fail any operation that
would cause them to increase, but you cannot force userland to close its fds.
And waiting for somebody to remove a device is wrong. It's gone physically.
There's no choice but to remove it. The refcounts can tell you when to free
data structures associated with devices, but what else do you want them to do?

	Regards
		Oliver

-------------------------------------------------------
This SF.net email is sponsored by: Scholarships for Techies!
Can't afford IT training? All 2003 ictp students receive scholarships.
Get hands-on training in Microsoft, Cisco, Sun, Linux/UNIX, and more.
www.ictp.com/training/sourceforge.asp
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-21 19:00                                           ` Oliver Neukum
@ 2003-01-21 20:02                                             ` Luben Tuikov
  2003-01-21 21:02                                               ` Alan Stern
  0 siblings, 1 reply; 106+ messages in thread
From: Luben Tuikov @ 2003-01-21 20:02 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: David Brownell, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Oliver Neukum wrote:
> Am Dienstag, 21. Januar 2003 19:16 schrieb Luben Tuikov:
> 
>>
>>When the Low Level Device Driver (LLDD), being the transport portal,
>>notices that the device is going away or has gone away from the
>>``fabric'' (wlg), it will fire a device-gone event with the kernel.
>>*Not* necessarily with SCSI Core, in fact I'd rather it didn't,
>>but with a well defined kernel entry for device-gone events.
> 
> 
> Well, we are in feature freeze. I see no alternative but to notify
> the mid layer. Who else but the mid layer knows what a physical device
> is logically associated with?

Yes, we're in  feature freeze.  I realize this and the fact
that this may be 2.7 work, but it's nevetheless worth
to brainstorm the issue.

I think one needs to notify at a higher level -- (some) decision making
may/will be made there. SCSI Core will be notified eventually, or maybe
right away. For all we know, the policy of removing a device could
be to just go into SCSI Core with the removal -- but the point is
that you need to notify at a higher level.

In due time, SCSI Core has no problem with a device disappearing.
As I mentioned already, the event will ``bubble down'' to SCSI Core,
at some point or immediately.

> 
>>At the same time the LLDD will start returning TARGET gone, or
>>whatever is appropriate to newly queued commands, and error out
>>all internally queued commands (if it does it's own queuing).
>>(I've seen this work nicely on mount and read/write(2) and fsck.)
> 
> 
> Right.

I've been saying (repeating) this for my last 3-4 emails.  Glad to
hear we've come to some kind of agreement. :-)

>>I.e. the ``synchronization'' has started already by the LLDD erroring
>>out commands, new and queued.
>>
>>All the while the kernel has started higher level cleaning up,
>>decrementing ref counts, etc, stuff which may not be so easy to be
>>cleaned up just by LLDD returning TARGET error.  Even though,
> 
> 
> You cannot really make anything depend on errors returned, because
> there simply may not be any commands queued. You can make it a

Exactly.  The more reason to have a notification even at a higher
level, because *if* you had users and whatnot using the device
then you'd want to let them/it know.  You need a higher level hook.
I can see a ton of uses for such a higher level hook.

> requirement for an LLDD to return all commands in flight with an error,
> but you can do little with these errors. Basically you have to treat them

As I've said, I've seen this method work nicely with mount and
fsck -- they time out almost right away, with different errors
of course, but LLDD returns TARGET error all the while.

So, either way (users or none), a higher level hook would seem
like a more general approach.

> like uncorrectable errors, except maybe for the error code returned to
> user space. But the processing of the disconnect itself should be triggered
> by the LLDD's notification, because it's the only indication of an unplug
> event you are sure to get.

I think this is the first thing I mentioned yesterday when I wrote
``transport initiated event''.

>>good design dictates that complete cleaning up should happen just
>>by the LLDD returning TARGET error (e.g. on mount), we *have* to allow
>>for this immediate high level entry point (as I mentioned above)
>>notification, which will be kind of ``meeting place'' for events like this.
> 
> 
> That I don't understand. It would seem to me to be cleanest to have just
> one path to process a disconnect event.

I also think that there should be one path: LLDD starts returning
TARGET error and all the while cleaning up has started from the top.

>>Depending on what needs to be done at those ``higher'' levels, the
>>event will eventually bubble down to the SCSI Core with something like
>>scsi_remove_device() which will do slave_destroy() in the driver.
>>
>>The point is that at that point in time, it will be *safe* to do
>>scsi_remove_device() as all ULP have alreay been notified, and they've
>>relinquished their use of the LLD (Low Level Device), thus the safety.
> 
> 
> But there can be no users of the LLDD at this point. There can of
> course be references to devices and hosts, but not really uses.

The more reason for a higher level hook -- you see, it generalizes
the cases of users and no users using the device -- you have it covered
both ways.  See my comments above.

> After we have done a notification of the event the first things to do
> are to make further opening of the device fail and make sure no more
> commands are sent to the device. Likewise all queued commands have
> returned with an error. So at this point it's impossible to use an unplugged
> device.

So here I take it you agree with me.

>>But there's no such thing as ``waiting around indefinitely'' or
>>``blocking wait'' as you've suggested in some of your emails.
>>
>>Even if this UL entry point doesn't do anything, ref counts should
>>go to zero, after all users error out on this device, at which point
>>the user can remove the device from *the system* by hand/old method
>>through proc or whatever finalizes for 2.6.
> 
> 
> You cannot be sure that reference counts will go to zero ever.
> You can be sure that they won't increase as you can fail any operation that
> would cause them to increase, but you cannot force userland to close its fds.
> And waiting for somebody to remove a device is wrong. It's gone physically.
> There's no choice but to remove it. The refcounts can tell you when to free
> data structures associated with devices, but what else do you want them to do?

A agree with all this.  What I was saying is the flexibility of the policy.
Yes, it is correct that we cannot force userland to close its fd's.
Just as you cannot force a parent process to collect child exit status :-) .
(Idea!)

I'm glad to see we're coming to an agreement.

-- 
Luben

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-21 20:02                                             ` [linux-usb-devel] " Luben Tuikov
@ 2003-01-21 21:02                                               ` Alan Stern
  2003-01-22 21:50                                                 ` Luben Tuikov
  0 siblings, 1 reply; 106+ messages in thread
From: Alan Stern @ 2003-01-21 21:02 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Oliver Neukum, David Brownell, Matthew Dharm, Mike Anderson,
	Greg KH, linux-usb-devel, Linux SCSI list

Here's another question to add to your discussion.

When a device is unplugged, the system's representation of that device
can't be removed immediately; there may be open fd's, mounts, pointers,
and so on.  Until the time comes when all these handles are released, all
interaction with the device has to fail, one way or another.

Whose responsibility is it to fail these interactions?

For something simple, like a USB serial port, it might turn out that the 
low-level device driver gets all these requests and then fails them.  That 
means the driver has to keep track of the fact that the device is no 
longer connected until some reference count goes to 0.

For SCSI and emulated SCSI devices, it might be the one of the SCSI layers 
that keeps track of the fact that the device has disconnected.  Or it 
might be somewhere else in the kernel.

It would nice to have some sort of coherent plan for how to handle this.  
In fact, it ought to be part of the device-driver model that underlies 
sysfs.  But so far as I am aware, there is currently nothing in the sysfs 
documentation to address the problem.

Alan Stern

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-21 21:02                                               ` Alan Stern
@ 2003-01-22 21:50                                                 ` Luben Tuikov
  2003-01-22 22:46                                                   ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: Luben Tuikov @ 2003-01-22 21:50 UTC (permalink / raw)
  To: Alan Stern
  Cc: Oliver Neukum, David Brownell, Matthew Dharm, Mike Anderson,
	Greg KH, linux-usb-devel, Linux SCSI list

Alan Stern wrote:
> Here's another question to add to your discussion.
> 
> When a device is unplugged, the system's representation of that device
> can't be removed immediately; there may be open fd's, mounts, pointers,
> and so on.  Until the time comes when all these handles are released, all
> interaction with the device has to fail, one way or another.
> 
> Whose responsibility is it to fail these interactions?

The transport.

When a device is plugged to the SAN/fabric (wlg) it may not be the case
that all initiators will know about it.  For this reason the transport
itself, not SCSI Core, not LLDD*, will decide if the CDB is deliverable.

* An LLDD may keep a table of seen devices (depending on the transport
it provides a portal to), and thus the decision may be made there, but
this doesn't have to be the case.

(Just think of SANs/IP SANs.)

> For something simple, like a USB serial port, it might turn out that the 
> low-level device driver gets all these requests and then fails them.  That 
> means the driver has to keep track of the fact that the device is no 
> longer connected until some reference count goes to 0.

A LLDD doesn't have to keep reference counts.  In the simple case
you mention above, it will check that the device is no longer reachable
and will return TARGET error, which will bubble up the layers, or the
Execute Command remote procedure will end with Service Delivery Failure
as the Service Response -- exactly the same effect as far as SCSI Core
is concerned.

The Service Response is Service Delivery Failure, in which case the
Status byte is undefined.

I've been wanting to include a Service Response into scsi_cmnd and
rename ``result'' into ``status'' to be closer to the SCSI Architecure,
for some time now, but we'll see when this will happen.

Newer drivers will make use of Service Response code, and be able to
address only by (target, lun) rather than (bus, target, lun), and
target may not be an int anymore.  But this is 2.7 stuff, or maybe
a separately distributed SCSI Core and LLDDs subsystem...

> For SCSI and emulated SCSI devices, it might be the one of the SCSI layers 
> that keeps track of the fact that the device has disconnected.  Or it 
> might be somewhere else in the kernel.

Right.  For this reason I'm thinking that a higher level hook on device
disconnect (if reported by the transport) might be needed.  This doesn't
mean that it has to do higher-level things, :-) , it might just call
SCSI Core, but as long as it goes through a higher layer.

> It would nice to have some sort of coherent plan for how to handle this.  
> In fact, it ought to be part of the device-driver model that underlies 
> sysfs.  But so far as I am aware, there is currently nothing in the sysfs 
> documentation to address the problem.

-- 
Luben

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-22 21:50                                                 ` Luben Tuikov
@ 2003-01-22 22:46                                                   ` Oliver Neukum
  2003-01-23 17:46                                                     ` Luben Tuikov
  0 siblings, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-22 22:46 UTC (permalink / raw)
  To: Luben Tuikov, Alan Stern
  Cc: David Brownell, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

> > Whose responsibility is it to fail these interactions?
>
> The transport.
>
> When a device is plugged to the SAN/fabric (wlg) it may not be the case
> that all initiators will know about it.  For this reason the transport
> itself, not SCSI Core, not LLDD*, will decide if the CDB is deliverable.
>
> * An LLDD may keep a table of seen devices (depending on the transport
> it provides a portal to), and thus the decision may be made there, but
> this doesn't have to be the case.
>
> (Just think of SANs/IP SANs.)

Not all the world is a SAN. USB has no possibility to even try an interaction
after the device is gone. We have to handle this flexibly. In fact, if a device
can vanish without a LLDD knowing about it, this is purely a problem of the
SCSI layer.

> > For something simple, like a USB serial port, it might turn out that the
> > low-level device driver gets all these requests and then fails them. 
> > That means the driver has to keep track of the fact that the device is no
> > longer connected until some reference count goes to 0.
>
> A LLDD doesn't have to keep reference counts.  In the simple case
> you mention above, it will check that the device is no longer reachable
> and will return TARGET error, which will bubble up the layers, or the

That is something that is impossible to some LLDDs. We have to keep
a record about devices and busses we can reach and can delete these
records only after we positively know that no more commands will come
down to the LLDD.

The alternative would be to check a table of available devices for every
command.

That means that we have to have a way to ensure that no more commands
will reach the LLDD which can be triggered without any commands to be
executed at all. This functionality has to come from the scsi mid layer.

> Newer drivers will make use of Service Response code, and be able to
> address only by (target, lun) rather than (bus, target, lun), and
> target may not be an int anymore.  But this is 2.7 stuff, or maybe
> a separately distributed SCSI Core and LLDDs subsystem...

Yes, but we need a solution for 2.6.
And it has to be reasonably simple.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-22 22:46                                                   ` Oliver Neukum
@ 2003-01-23 17:46                                                     ` Luben Tuikov
  2003-01-23 18:19                                                       ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: Luben Tuikov @ 2003-01-23 17:46 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Oliver Neukum wrote:
> 
> Not all the world is a SAN. USB has no possibility to even try an interaction
> after the device is gone. We have to handle this flexibly.

Thus the example in the original post.  I.e. for simple transports whose
portals get notified when a device is plugged off (USB), the LLDD
can notify SCSI Core, by setting a state variable in scsi_device.
In which case SCSI Core can answer with the proper TARGET error code.
(This was outlined before, scsi_command->online:1 ...)

> In fact, if a device
> can vanish without a LLDD knowing about it, this is purely a problem of the
> SCSI layer.

No, of course not.  (Think of IP.)  When a device vanishes and LLDD doesn't
know about it (more complicated transports), the CDB will return with
the proper Service Response, since the transport(s) won't be able to deliver
it. This will bubble up through SCSI Core and the error returned will have
to be the same as that of the simpler transports, as outlined above.

>>>For something simple, like a USB serial port, it might turn out that the
>>>low-level device driver gets all these requests and then fails them. 
>>>That means the driver has to keep track of the fact that the device is no
>>>longer connected until some reference count goes to 0.
>>
>>A LLDD doesn't have to keep reference counts.  In the simple case
>>you mention above, it will check that the device is no longer reachable
>>and will return TARGET error, which will bubble up the layers, or the
> 
> 
> That is something that is impossible to some LLDDs. We have to keep
> a record about devices and busses we can reach and can delete these
> records only after we positively know that no more commands will come
> down to the LLDD.

But USB does keep such a record, doesn't it?   *Even if it doesn't*, as outlined
above, it can set a state variable in scsi_device and SCSI Core can take over
for error return values.

> The alternative would be to check a table of available devices for every
> command.

A command is destined to a device, at SCSI Core queuing logic a check
can be made... No need to go through tables of devices.

> That means that we have to have a way to ensure that no more commands
> will reach the LLDD which can be triggered without any commands to be
> executed at all. This functionality has to come from the scsi mid layer.

For simple transports yes; for more complicated ones, the CDB will
not be able to be delivered, and will return with error.

>>Newer drivers will make use of Service Response code, and be able to
>>address only by (target, lun) rather than (bus, target, lun), and
>>target may not be an int anymore.  But this is 2.7 stuff, or maybe
>>a separately distributed SCSI Core and LLDDs subsystem...
> 
> 
> Yes, but we need a solution for 2.6.
> And it has to be reasonably simple.

I think we have enough ideas to implement a reasonable one.

-- 
Luben

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 17:46                                                     ` Luben Tuikov
@ 2003-01-23 18:19                                                       ` Oliver Neukum
  2003-01-23 19:07                                                         ` Luben Tuikov
  2003-01-23 20:41                                                         ` A different look at block device hotswap in the Linux kernel Steven Dake
  0 siblings, 2 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-23 18:19 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Am Donnerstag, 23. Januar 2003 18:46 schrieb Luben Tuikov:
> Oliver Neukum wrote:
> > Not all the world is a SAN. USB has no possibility to even try an
> > interaction after the device is gone. We have to handle this flexibly.
>
> Thus the example in the original post.  I.e. for simple transports whose
> portals get notified when a device is plugged off (USB), the LLDD
> can notify SCSI Core, by setting a state variable in scsi_device.
> In which case SCSI Core can answer with the proper TARGET error code.
> (This was outlined before, scsi_command->online:1 ...)

Very well, so you agree that the SCSI layer should export to the LLDD
a function to set devices offline?

> > In fact, if a device
> > can vanish without a LLDD knowing about it, this is purely a problem of
> > the SCSI layer.
>
> No, of course not.  (Think of IP.)  When a device vanishes and LLDD doesn't
> know about it (more complicated transports), the CDB will return with
> the proper Service Response, since the transport(s) won't be able to
> deliver it. This will bubble up through SCSI Core and the error returned
> will have to be the same as that of the simpler transports, as outlined
> above.

Yes, sorry. To be precise, this means that the LLDD has to do nothing
special, as it has to implement checking for a failing command anyway.
But it's not entirely the same. If a command cannot be delivered it may or may
not be appropriate to start error recovery. After the LLDD has told
the SCSI layer that it has noticed a device going away, there must be no
error recovery.

> > That means that we have to have a way to ensure that no more commands
> > will reach the LLDD which can be triggered without any commands to be
> > executed at all. This functionality has to come from the scsi mid layer.
>
> For simple transports yes; for more complicated ones, the CDB will
> not be able to be delivered, and will return with error.

Good.
So the first thing a LLDD has to do after it has learned about a device
being removed is to have the device block.
1. set device offline
But commands may still be in flight.IMHO it is not right to assume that
all commands now in flight to a device have failed, as some may have
completed successfully in time, or failed for other reasons than unplugging.
So it should be the LLDD's responsibility to finish the outstanding commands.
Furthermore, there's a window for commands already having passed the check
for offline but not yet being noticed by the LLDD. The simplest solution is to
use a waiting primitive from RCU. So we are at:

1. set device offline
2. synchronize the kernel
3. finish all pending commands

So far with me?
The LLDD could now forget about the device and be done with it.
However there's a problem left. The device may come back.
What happens if a device with the same ID is reconnected?

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 18:19                                                       ` Oliver Neukum
@ 2003-01-23 19:07                                                         ` Luben Tuikov
  2003-01-23 19:40                                                           ` Oliver Neukum
  2003-01-23 20:41                                                         ` A different look at block device hotswap in the Linux kernel Steven Dake
  1 sibling, 1 reply; 106+ messages in thread
From: Luben Tuikov @ 2003-01-23 19:07 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Oliver Neukum wrote:
> 
> Very well, so you agree that the SCSI layer should export to the LLDD
> a function to set devices offline?

I've never really disagreed -- simpler transports will make use
of such a function.  The important point to note is that the error
return value for simpler and more complicated transports has to
be the same (i.e. ones which know about the device disconnect and
others which send out the CDB and which will return with error).

I forgot to mention this with my previous email: think of a LLDD
more as part of the transport than of SCSI Core.

> Yes, sorry. To be precise, this means that the LLDD has to do nothing
> special, as it has to implement checking for a failing command anyway.
> But it's not entirely the same. If a command cannot be delivered it may or may
> not be appropriate to start error recovery. After the LLDD has told
> the SCSI layer that it has noticed a device going away, there must be no
> error recovery.

Error recovery should not be that complicated for device being disconnected,
just error out all commands new and old -- as I've said so many times.
The command structs will return back to LLDD and all will be good.
(Simple transports.)

More complicated transports will just return Service Delivery Failure.

(See below.)

> Good.
> So the first thing a LLDD has to do after it has learned about a device
> being removed is to have the device block.

``block'' (verb) is such a strong word.

* Simple transports: call scsi_set_device_offline(dev) or something like this.

* More complicated transports: SCSI Core sees Service Response of Service
Delivery Failure and it itself calls scsi_set_device_offline(dev).

scsi_set_device_offline(dev) calls a high-level kernel function to start
higher level things (block queue cut off, etc) which *may* need to be done.

The control path will eventually bubble down to SCSI Core which will
error out already queued commands (unless they've returned already
with the appropriate error code), remove the device, etc, etc, etc.

> 1. set device offline
> But commands may still be in flight.IMHO it is not right to assume that
> all commands now in flight to a device have failed, as some may have
> completed successfully in time, or failed for other reasons than unplugging.

They will just return with ok status and after a certain point in time,
all others will return with the appropriate error -- in which case see above.

> So it should be the LLDD's responsibility to finish the outstanding commands.

LLDD cannot really ``finish'' outstanding commands, it's just a transport
portal.

> Furthermore, there's a window for commands already having passed the check
> for offline but not yet being noticed by the LLDD.

They will return with an appropriate error.

> The simplest solution is to
> use a waiting primitive from RCU. So we are at:
> 
> 1. set device offline
> 2. synchronize the kernel
> 3. finish all pending commands

I told you before: 3 starts *before* 2 and 3 is *part* of 2.
Furthermore, after 1 has happened in time, all pending commands
will error out (wrt a time line).

2 is what I call ``higher-level hook'', but it's not really
``synchronization''. Synchronization will take delta-time, it
will not happen instantaneously.

> So far with me?
> The LLDD could now forget about the device and be done with it.

Some LLDD will not have the concept of device -- they'll just
set up the remote procedure call Execute Command and initiate
it, given a target and LUN. Who knows what happens after that?
I mean the command may go through several transports..., the LUN
may get translated a few times, etc.

We have to keep this in mind.
And some other transports will know about devices.

I.e. you have to allow for the possibility of a command
being sent to a non-existent device through LLDD, in which
case the LLDD/transport will have to error it out.

> However there's a problem left. The device may come back.
> What happens if a device with the same ID is reconnected?

-- 
Luben

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 19:07                                                         ` Luben Tuikov
@ 2003-01-23 19:40                                                           ` Oliver Neukum
  2003-01-23 20:28                                                             ` Doug Ledford
  0 siblings, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-23 19:40 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Am Donnerstag, 23. Januar 2003 20:07 schrieb Luben Tuikov:
> Oliver Neukum wrote:
> > Very well, so you agree that the SCSI layer should export to the LLDD
> > a function to set devices offline?
>
> I've never really disagreed -- simpler transports will make use
> of such a function.  The important point to note is that the error

Good.

> return value for simpler and more complicated transports has to
> be the same (i.e. ones which know about the device disconnect and
> others which send out the CDB and which will return with error).

Why? It throws away information needlessly. If the LLDD knows
that the reason is unplugging why not report it? A LLDD that doesn't
know about devices going away on the other hand can just report
an error. Can the higher layers simply assume that the device was unplugged?
IMHO they can't and should at least try to recover from the error.

> I forgot to mention this with my previous email: think of a LLDD
> more as part of the transport than of SCSI Core.

Hard to do. The scsi mid layer does timing out and error handling.
There's a relatively tight connection.

> > So the first thing a LLDD has to do after it has learned about a device
> > being removed is to have the device block.
>
> ``block'' (verb) is such a strong word.

What do you prefer ? ;-) I'll certainly use another word if you like me to
do so.

> * Simple transports: call scsi_set_device_offline(dev) or something like
> this.
>
> * More complicated transports: SCSI Core sees Service Response of Service
> Delivery Failure and it itself calls scsi_set_device_offline(dev).
>
> scsi_set_device_offline(dev) calls a high-level kernel function to start
> higher level things (block queue cut off, etc) which *may* need to be done.

How do you differentiate between real failure and device removal?

> > So it should be the LLDD's responsibility to finish the outstanding
> > commands.
>
> LLDD cannot really ``finish'' outstanding commands, it's just a transport
> portal.

Well, report back the results, if you prefer, thus returning ownership to
higher layers.

> > Furthermore, there's a window for commands already having passed the
> > check for offline but not yet being noticed by the LLDD.
>
> They will return with an appropriate error.

Not quite so simple. Some LLDDs need to know at some point that
no more commands will arrive for sure and none are still in flight.

> > The simplest solution is to
> > use a waiting primitive from RCU. So we are at:
> >
> > 1. set device offline
> > 2. synchronize the kernel
> > 3. finish all pending commands
>
> I told you before: 3 starts *before* 2 and 3 is *part* of 2.
> Furthermore, after 1 has happened in time, all pending commands
> will error out (wrt a time line).

That's not enough. The LLDD has to know that they've errored
in order to free associated data structures. The simplest way to
do so is returning them to higher layers.

> 2 is what I call ``higher-level hook'', but it's not really
> ``synchronization''. Synchronization will take delta-time, it
> will not happen instantaneously.

Please explain.

> I.e. you have to allow for the possibility of a command
> being sent to a non-existent device through LLDD, in which
> case the LLDD/transport will have to error it out.

Non existant is OK. For a device already flagged offline is not OK.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 19:40                                                           ` Oliver Neukum
@ 2003-01-23 20:28                                                             ` Doug Ledford
  2003-01-23 20:59                                                               ` Oliver Neukum
                                                                                 ` (2 more replies)
  0 siblings, 3 replies; 106+ messages in thread
From: Doug Ledford @ 2003-01-23 20:28 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

On Thu, Jan 23, 2003 at 08:40:41PM +0100, Oliver Neukum wrote:
> Am Donnerstag, 23. Januar 2003 20:07 schrieb Luben Tuikov:
> 
> > return value for simpler and more complicated transports has to
> > be the same (i.e. ones which know about the device disconnect and
> > others which send out the CDB and which will return with error).
> 
> Why? It throws away information needlessly. If the LLDD knows
> that the reason is unplugging why not report it?

What does it matter?  If you know that the device was unplugged, are you 
going to then wait for it to be plugged back in and if it's plugged back 
in (and confirmed to be the same device via serial number or some such) 
pick back up where you left off like nothing happened?  That would be the 
only reason to care about whether it was unplugged or died.  And if you 
did that you would probably violate the rule of least suprise with people 
unplugging their hard disks in a fit when they realize they did rm -fr on 
the drive and then when they plug it back in to check out how much damage 
they did it picks back up where it left off!

> A LLDD that doesn't
> know about devices going away on the other hand can just report
> an error.

For SPI that would be DID_TIMEOUT, aka the device wasn't there.  For iSCSI 
it would be timeout as well.  Pretty much anything is simply going to be 
either timeout or if you know it's gone you could return some other error.

> Can the higher layers simply assume that the device was unplugged?
> IMHO they can't and should at least try to recover from the error.

Correct.  And this isn't a problem.

> > I forgot to mention this with my previous email: think of a LLDD
> > more as part of the transport than of SCSI Core.
> 
> Hard to do. The scsi mid layer does timing out and error handling.
> There's a relatively tight connection.
> 
> > > So the first thing a LLDD has to do after it has learned about a device
> > > being removed is to have the device block.
> >
> > ``block'' (verb) is such a strong word.
> 
> What do you prefer ? ;-) I'll certainly use another word if you like me to
> do so.
> 
> > * Simple transports: call scsi_set_device_offline(dev) or something like
> > this.
> >
> > * More complicated transports: SCSI Core sees Service Response of Service
> > Delivery Failure and it itself calls scsi_set_device_offline(dev).

Actually, I would have both complicated and simple transports call 
scsi_set_device_offline() and for two reasons.  1) you have to provide 
that function for simple drivers so duplicating other detection code in 
the scsi completion handler is a waste.  2) pretty much all transports 
will learn of the device being offline while they are in their interrupt 
handler and should already be holding the lock for the device, which means 
that calling scsi_set_device_offline() won't race with scsi_request_fn() 
which also needs the device lock (which in reality is the host lock).  
Saving this race is convenient enough IMHO to warrant saying that's the 
way things need to be.

> > scsi_set_device_offline(dev) calls a high-level kernel function to start
> > higher level things (block queue cut off, etc) which *may* need to be done.

No, scsi_set_device_offline() schedules the error handler thread for that 
host to be woken up.

> How do you differentiate between real failure and device removal?

We don't, and we shouldn't.  Device removal *is* a real failure.

> > > So it should be the LLDD's responsibility to finish the outstanding
> > > commands.
> >
> > LLDD cannot really ``finish'' outstanding commands, it's just a transport
> > portal.
> 
> Well, report back the results, if you prefer, thus returning ownership to
> higher layers.

If the LLDD is the type such that it knows the device is gone (aka, in my 
driver if I get a selection timeout then I know something is fishy and can 
proceed from there, iSCSI may not be so lucky), then it has one of two 
choices.  1) it may flush any commands that it can out of the hardware and 
return them immediately with the same error condition as the one that it 
is already returning.  2) it can sit and wait for the commands to timeout 
one by one if that's what it wants.  Since the device has already been 
marked offline by scsi_set_device_offline() and the error handler thread 
is already scheduled to run for the device, 2 is probably the easiest 
thing for the driver to do.  The error handler will call the abort/reset 
routine for each command still outstanding and the LLDD can just clean up 
one at a time and return them as it would under any other error condition.

> > > Furthermore, there's a window for commands already having passed the
> > > check for offline but not yet being noticed by the LLDD.

No, not if you handle things in the interrupt handler and if your 
interrupt handler holds the host lock like it's suppossed to.  If you want 
to go without using these locking methods in your lldd then you are free 
to do so, but that means *you* need to handle this situation in your 
driver, the mid layer shouldn't be trying to solve this problem for lldd 
that want to be lock free.

> > They will return with an appropriate error.
> 
> Not quite so simple. Some LLDDs need to know at some point that
> no more commands will arrive for sure and none are still in flight.

Follow the simple rule above and when you call scsi_set_device_offline() 
then you will *never* get called in your queuecommand() for that device 
again.  That is separate from all commands being cleaned up.  That will 
happen after the error handler thread has run and cleaned your driver out.  
Once all the commands are gone and no more are arriving, then if, and only 
if, someone actually removes the device from the scsi subsystem (maybe 
hotplug manager or something) then you will get the typical 
slave_destroy() call to tell you that it is safe to release all resources 
related to this device.  Otherwise, the device will hang around as an 
offline device until someone does echo "scsi-remove-single-device a b c d" 
> /proc/scsi/scsi to remove it.

Basically, as I see it, we need a new function scsi_set_device_offline()  
that marks the device offline, we need an offline check in
scsi_request_fn(), and we need scsi_set_device_offline() to schedule the
error handler thread for wakeup (and it should flag the device that needs
recovered so that the error handler thread knows what to do), then the
error handler thread routine needs modified to understand what to do with
a device that's been offlined with commands outstanding, and once all the
commands are returned it should signal the higher layer (block or
whatever) that the device is offlined.  Sounds like about an afternoons
worth of work to me and should solve the issues you are bringing up.

As far as plugging back in, the answer is simple.  Until the old instance 
is dead *and removed* a new one can't be added at the same ID, aka you 
simply ignore the hot plug until the hot remove has completed.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 20:28                                                             ` Doug Ledford
@ 2003-01-23 20:59                                                               ` Oliver Neukum
  2003-01-23 21:34                                                                 ` Doug Ledford
  2003-01-24  0:15                                                               ` Patrick Mansfield
  2003-01-24  8:33                                                               ` David Brownell
  2 siblings, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-23 20:59 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Hi Doug

> Actually, I would have both complicated and simple transports call
> scsi_set_device_offline() and for two reasons.  1) you have to provide
> that function for simple drivers so duplicating other detection code in
> the scsi completion handler is a waste.  2) pretty much all transports
> will learn of the device being offline while they are in their interrupt
> handler and should already be holding the lock for the device, which means

This is not the case for USB and IEEE1394. I am not sure about PCMCIA.
We are in context of a kernel thread while we learn about device removal.

> that calling scsi_set_device_offline() won't race with scsi_request_fn()
> which also needs the device lock (which in reality is the host lock).
> Saving this race is convenient enough IMHO to warrant saying that's the
> way things need to be.
>
> > > scsi_set_device_offline(dev) calls a high-level kernel function to
> > > start higher level things (block queue cut off, etc) which *may* need
> > > to be done.
>
> No, scsi_set_device_offline() schedules the error handler thread for that
> host to be woken up.
>
> > How do you differentiate between real failure and device removal?
>
> We don't, and we shouldn't.  Device removal *is* a real failure.

Well shouldn't a device removal remove the device as a logical
entity and a failure should not?

> If the LLDD is the type such that it knows the device is gone (aka, in my
> driver if I get a selection timeout then I know something is fishy and can
> proceed from there, iSCSI may not be so lucky), then it has one of two
> choices.  1) it may flush any commands that it can out of the hardware and
> return them immediately with the same error condition as the one that it
> is already returning.  2) it can sit and wait for the commands to timeout
> one by one if that's what it wants.  Since the device has already been
> marked offline by scsi_set_device_offline() and the error handler thread
> is already scheduled to run for the device, 2 is probably the easiest
> thing for the driver to do.  The error handler will call the abort/reset

Again not for USB and IEEE1394. We'd have to wait for the error handler
to finish. Doing it ourselves is easier.

> Once all the commands are gone and no more are arriving, then if, and only
> if, someone actually removes the device from the scsi subsystem (maybe
> hotplug manager or something) then you will get the typical
> slave_destroy() call to tell you that it is safe to release all resources
> related to this device.  Otherwise, the device will hang around as an
> offline device until someone does echo "scsi-remove-single-device a b c d"

Eek. That part I must strongly object to. The device is physically gone.
Ever bothering the LLDD with it is very inconvinient.

> > /proc/scsi/scsi to remove it.
>
> Basically, as I see it, we need a new function scsi_set_device_offline()
> that marks the device offline, we need an offline check in

These functions are needed for a whole bus as well. USB needs it.

> As far as plugging back in, the answer is simple.  Until the old instance
> is dead *and removed* a new one can't be added at the same ID, aka you
> simply ignore the hot plug until the hot remove has completed.

What do you mean? It is dead because it is removed. How can a device be
anything than dead if it has been unplugged? Please elaborate.

And who should ignore a hot addition, the LLDD or SCSI core.
If the former, again I must object.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 20:59                                                               ` Oliver Neukum
@ 2003-01-23 21:34                                                                 ` Doug Ledford
  2003-01-23 22:39                                                                   ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: Doug Ledford @ 2003-01-23 21:34 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

On Thu, Jan 23, 2003 at 09:59:28PM +0100, Oliver Neukum wrote:
> Hi Doug
> 
> > Actually, I would have both complicated and simple transports call
> > scsi_set_device_offline() and for two reasons.  1) you have to provide
> > that function for simple drivers so duplicating other detection code in
> > the scsi completion handler is a waste.  2) pretty much all transports
> > will learn of the device being offline while they are in their interrupt
> > handler and should already be holding the lock for the device, which means
> 
> This is not the case for USB and IEEE1394. I am not sure about PCMCIA.
> We are in context of a kernel thread while we learn about device removal.

No.  You might be in a kernel thread context when you decode an interrupt 
down to determining that a device was removed, but somewhere along the 
line you took an interrupt that told you the device was removed (or else 
the command simply timed out and you are in the error handler for the 
command already).  Are you saying that the USB subsystem queues up those 
interrupt packets and decodes them later (which is fine, I just want to be 
clear on the point)?

> > that calling scsi_set_device_offline() won't race with scsi_request_fn()
> > which also needs the device lock (which in reality is the host lock).
> > Saving this race is convenient enough IMHO to warrant saying that's the
> > way things need to be.
> >
> > > > scsi_set_device_offline(dev) calls a high-level kernel function to
> > > > start higher level things (block queue cut off, etc) which *may* need
> > > > to be done.
> >
> > No, scsi_set_device_offline() schedules the error handler thread for that
> > host to be woken up.
> >
> > > How do you differentiate between real failure and device removal?
> >
> > We don't, and we shouldn't.  Device removal *is* a real failure.
> 
> Well shouldn't a device removal remove the device as a logical
> entity and a failure should not?

No.  That's what the user space hot plug manager is for.  If you want this 
type of behaviour, you take an interrupt to tell you that the device is 
gone, you mark it gone, the error handler cleans up any outstanding 
commands, then once the device no longer has any commands outstanding 
*then* the hot plug manager can successfully umount/unattach/whatever the 
device and then tell the kernel to actually remove it.  Putting this into 
the scsi stack when it's already in place elsewhere makes no sense to me.

> > If the LLDD is the type such that it knows the device is gone (aka, in my
> > driver if I get a selection timeout then I know something is fishy and can
> > proceed from there, iSCSI may not be so lucky), then it has one of two
> > choices.  1) it may flush any commands that it can out of the hardware and
> > return them immediately with the same error condition as the one that it
> > is already returning.  2) it can sit and wait for the commands to timeout
> > one by one if that's what it wants.  Since the device has already been
> > marked offline by scsi_set_device_offline() and the error handler thread
> > is already scheduled to run for the device, 2 is probably the easiest
> > thing for the driver to do.  The error handler will call the abort/reset
> 
> Again not for USB and IEEE1394. We'd have to wait for the error handler
> to finish. Doing it ourselves is easier.

OK, are you reading my comments or not?  I said "since the error handler 
thread is already scheduled to run for the device, 2 is probably easiest".  
In other words, you don't have to wait for anything, it's gonna happen 
post-haste.  So since you should already have proper error handling 
functions in place (You do have proper error handler functions in place, 
don't you?), duplicating that code here won't really buy you anything.

> > Once all the commands are gone and no more are arriving, then if, and only
> > if, someone actually removes the device from the scsi subsystem (maybe
> > hotplug manager or something) then you will get the typical
> > slave_destroy() call to tell you that it is safe to release all resources
> > related to this device.  Otherwise, the device will hang around as an
> > offline device until someone does echo "scsi-remove-single-device a b c d"
> 
> Eek. That part I must strongly object to. The device is physically gone.
> Ever bothering the LLDD with it is very inconvinient.

OK, let's look at this realistically.  I'm saying you get an interrupt 
telling you that the device is gone and you tell the scsi core the same 
thing.  Immediately after that the scsi core calls your error handler 
routines to clean up any pending commands on the device.  Once all those 
pending commands are cleaned up, the hot plug manager is free to remove 
the device from the system.  Once the hot plug manager calls for the free 
to happen, you get a slave_destroy() call and you free the instances.  
This all happens in a span of a few milliseconds most likely.  Is that 
really so inconvenient for you?

> > > /proc/scsi/scsi to remove it.
> >
> > Basically, as I see it, we need a new function scsi_set_device_offline()
> > that marks the device offline, we need an offline check in
> 
> These functions are needed for a whole bus as well. USB needs it.
> 
> > As far as plugging back in, the answer is simple.  Until the old instance
> > is dead *and removed* a new one can't be added at the same ID, aka you
> > simply ignore the hot plug until the hot remove has completed.
> 
> What do you mean? It is dead because it is removed. How can a device be
> anything than dead if it has been unplugged? Please elaborate.

I said "old instance", aka the internal data structs (struct scsi_device 
for that device).  A device can be dead but not removed from the scsi 
subsys if no one has cleaned up after the removal by unmounting any 
filesystems that were on it and removing the scsi device itself.  That 
would be the job of the hotplug manager.

> And who should ignore a hot addition, the LLDD or SCSI core.
> If the former, again I must object.

The scsi core doesn't allow two devices with the same complete ID set.  
You would either have to attach the device at a different ID (aka khubd 
could set the reattached device to a higher SCSI ID or something) or wait 
for the hot plug manager to complete the old instance of the device's 
removal before adding the device back in again.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 21:34                                                                 ` Doug Ledford
@ 2003-01-23 22:39                                                                   ` Oliver Neukum
  2003-01-23 23:23                                                                     ` Doug Ledford
  2003-01-23 23:25                                                                     ` Matthew Dharm
  0 siblings, 2 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-23 22:39 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list


> No.  You might be in a kernel thread context when you decode an interrupt
> down to determining that a device was removed, but somewhere along the
> line you took an interrupt that told you the device was removed (or else
> the command simply timed out and you are in the error handler for the
> command already).  Are you saying that the USB subsystem queues up those
> interrupt packets and decodes them later (which is fine, I just want to be
> clear on the point)?

Yes, an interrupt occurs, it's queued, a kernel thread woken up, it decodes
the interrupts and notifies the device driver.

> No.  That's what the user space hot plug manager is for.  If you want this
> type of behaviour, you take an interrupt to tell you that the device is
> gone, you mark it gone, the error handler cleans up any outstanding
> commands, then once the device no longer has any commands outstanding
> *then* the hot plug manager can successfully umount/unattach/whatever the
> device and then tell the kernel to actually remove it.  Putting this into
> the scsi stack when it's already in place elsewhere makes no sense to me.

Well, it's a SCSI matter.

> > > If the LLDD is the type such that it knows the device is gone (aka, in
> > > my driver if I get a selection timeout then I know something is fishy
> > > and can proceed from there, iSCSI may not be so lucky), then it has one
> > > of two choices.  1) it may flush any commands that it can out of the
> > > hardware and return them immediately with the same error condition as
> > > the one that it is already returning.  2) it can sit and wait for the
> > > commands to timeout one by one if that's what it wants.  Since the
> > > device has already been marked offline by scsi_set_device_offline() and
> > > the error handler thread is already scheduled to run for the device, 2
> > > is probably the easiest thing for the driver to do.  The error handler
> > > will call the abort/reset
> >
> > Again not for USB and IEEE1394. We'd have to wait for the error handler
> > to finish. Doing it ourselves is easier.
>
> OK, are you reading my comments or not?  I said "since the error handler

Oh, I do. Some of them just seem impractical from a USB point of view.

> thread is already scheduled to run for the device, 2 is probably easiest".
> In other words, you don't have to wait for anything, it's gonna happen
> post-haste.  So since you should already have proper error handling
> functions in place (You do have proper error handler functions in place,
> don't you?), duplicating that code here won't really buy you anything.

I have memory to free. I can do that only after the last command is gone.
I'd have to yield in a loop and count commands. All quite messy.

> OK, let's look at this realistically.  I'm saying you get an interrupt
> telling you that the device is gone and you tell the scsi core the same
> thing.  Immediately after that the scsi core calls your error handler
> routines to clean up any pending commands on the device.  Once all those
> pending commands are cleaned up, the hot plug manager is free to remove
> the device from the system.  Once the hot plug manager calls for the free
> to happen, you get a slave_destroy() call and you free the instances.
> This all happens in a span of a few milliseconds most likely.  Is that
> really so inconvenient for you?

Yes. There is no such thing as "most likely". I have to code for the worst
case. So "most likely" means "maybe never". Either do or don't. We cannot
have a device removal fail for any reason. It drives up complexity by an
order of magnitude.

Besides having a callback going from kernel code through user space
back to kernel code is incredibly ugly.

	Regards
		Oliver


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 22:39                                                                   ` Oliver Neukum
@ 2003-01-23 23:23                                                                     ` Doug Ledford
  2003-01-23 23:25                                                                     ` Matthew Dharm
  1 sibling, 0 replies; 106+ messages in thread
From: Doug Ledford @ 2003-01-23 23:23 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

On Thu, Jan 23, 2003 at 11:39:40PM +0100, Oliver Neukum wrote:
> 
> > No.  You might be in a kernel thread context when you decode an interrupt
> > down to determining that a device was removed, but somewhere along the
> > line you took an interrupt that told you the device was removed (or else
> > the command simply timed out and you are in the error handler for the
> > command already).  Are you saying that the USB subsystem queues up those
> > interrupt packets and decodes them later (which is fine, I just want to be
> > clear on the point)?
> 
> Yes, an interrupt occurs, it's queued, a kernel thread woken up, it decodes
> the interrupts and notifies the device driver.

Fine.  In that code, when it detects a device being removed, it would do 
this:

spin_lock_irqsave(&device->host->host_lock);
scsi_set_device_offline(device);
spin_unlock_irqrestore(&device->host->host_lock);

and it will, from that point on, never get another command for that 
device.  Now, what would you normally do after that?

> > No.  That's what the user space hot plug manager is for.  If you want this
> > type of behaviour, you take an interrupt to tell you that the device is
> > gone, you mark it gone, the error handler cleans up any outstanding
> > commands, then once the device no longer has any commands outstanding
> > *then* the hot plug manager can successfully umount/unattach/whatever the
> > device and then tell the kernel to actually remove it.  Putting this into
> > the scsi stack when it's already in place elsewhere makes no sense to me.
> 
> Well, it's a SCSI matter.

Unmounting a mounted filesystem is not a scsi matter.  It's not even clear 
that it's what we want in all cases.  As I mentioned in a separate private 
email, I would find it cool as hell if I could mount my USB2.0 mp3 player 
that looks like a regular hard disk and have it configured as a permanent 
mount that simply deferred I/O errors indefinitely.  Then I would be free 
to write new files to the filesystem and if the device was plugged in they 
would get sent immediately and if it wasn't then the writes would just be 
buffered until the next time the hard disk was plugged in.  Then, when it 
did get a hotplug event, if there are any buffered up events in the device 
request queue we could just kick the request queue and everything would 
get written out. 

That would just be cool as hell, but not very feasible if I follow your 
plan.  That's why I think the user space hot plug manager should decide 
what to do in these cases.

> Oh, I do. Some of them just seem impractical from a USB point of view.
> 
> > thread is already scheduled to run for the device, 2 is probably easiest".
> > In other words, you don't have to wait for anything, it's gonna happen
> > post-haste.  So since you should already have proper error handling
> > functions in place (You do have proper error handler functions in place,
> > don't you?), duplicating that code here won't really buy you anything.
> 
> I have memory to free. I can do that only after the last command is gone.
> I'd have to yield in a loop and count commands. All quite messy.

No.  You would implement a slave_destroy() entry point in your driver and 
*forget* about counting commands or yielding in a loop.  It *simplifies* 
low level drivers when they use the facilities provided.  The 
infrastructure is all there, but if you want to reimplement it yourself in 
your driver, go right ahead.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 22:39                                                                   ` Oliver Neukum
  2003-01-23 23:23                                                                     ` Doug Ledford
@ 2003-01-23 23:25                                                                     ` Matthew Dharm
  2003-01-24 15:34                                                                       ` Alan Stern
                                                                                         ` (2 more replies)
  1 sibling, 3 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-01-23 23:25 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Doug Ledford, Luben Tuikov, Alan Stern, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 3909 bytes --]

Well, I've been watching this go on for days.  I hate to weigh in now, but
I think someone needs to understand what the guy writing the code that is
facing this problem really wants.

First, let me say that a USB storage device shows up as a HBA.  That's
because some devices are actually USB/SCSI bridges.  But, since I work at
the 'emulated host' level, that's where I'm focused.

What I want:

I want to be able to free resources associated with a device within a
finite (and well bounded) amount of time after I am notified that the
device is gone.

I want to be able to inform the SCSI mid-layer, which will then inform
higher layers, that the device is gone.  This is so that all may deal with
this however they want.  I really don't care who does what, as long as we
don't crash.  This implies that all block-type drivers will need to become
hotplug aware, or the SCSI mid-layer will have to fake command failures.

I want to be able to do as little command-trickery as possible.  If I have
to do it, then that means the next hotplug-capable LLDD must do it also.
Duplication of code is bad -- it should all be handled in the mid-layer.

As yet, the interface I have to the SCSI mid-layer fails on all three
points here.

And now, some of my opinions on how this should all work:

It would be nice if the user informed us about removing the device before
they did it.  But we shouldn't crash if they don't.

I don't want to be hanging around after a device is gone, spinning my
wheels because some other part of the kernel can't handle the fact that the
device is gone.  My driver is a passthru between the a SCSI emualted host
and a physical USB device -- if my device is gone, I want to be out of
there.  (Oddly enough, I'm starting to think there may be a DoS attack here
if you force the LLDD to stay -- after all, it consumes memory....)

Remember, the physical plug doesn't ask me if it's okay, and I don't get to
ask the SCSI mid-layer if it's okay.  Yes, starting with the user clicking
to tell us would be nice, but I don't get to see that.  All I get to see is
an indication that the plug is pulled.

I don't really give a rat's a** about 'how SCSI works' or how it's
specified or CAM models or any of that.  I try to live in the real world as
much as possible.  In that world, I'm not asking to remove an HBA -- I'm
telling you it's been removed.  I can't call it back.  I can't even fake a
command (other than perhaps INQUIRY) in any meaningful way.  THERE IS
NOTHING I CAN DO BUT KEEP INSISTING THAT THE DEVICE IS GONE!

It would be nice if the user could inform various parts of the kernel that
this device was going away, and then all sorts of cleanup could happen.
But I really don't care -- all I'm trying to do is exit without a resource
leak under all circumstances.

It would be nice if the SCSI mid-layer kept track of what commands were in
what stages in who's queues.  After all, if I hot-unplug a PCI SCSI
controller, the controller really isn't going to be able to complete those
commands for us -- we have to assume that commands queued by a LLDD are
really just being sent to the hardware for queuing.  If that wasn't the
case, then having LLDD queue capability doesn't make sense.

Now, here's the kicker -- this is what I think Linus wants:

Linus said to me, with a degree of annoyance, that he doesn't want
usb-storage to keep any associations of departed devices with SCSI emulated
hosts.  That means that I need to be able to add and remove hosts at the
will of the end-user.  In the end, what drives the entire process is what
Linus' hand does when it's placed on his USB flashcard reader.

Matt

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

It was a new hope.
					-- Dust Puppy
User Friendly, 12/25/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 23:25                                                                     ` Matthew Dharm
@ 2003-01-24 15:34                                                                       ` Alan Stern
  2003-01-24 16:06                                                                         ` Oliver Neukum
                                                                                           ` (2 more replies)
  2003-01-24 19:10                                                                       ` Luben Tuikov
  2003-01-24 21:48                                                                       ` Doug Ledford
  2 siblings, 3 replies; 106+ messages in thread
From: Alan Stern @ 2003-01-24 15:34 UTC (permalink / raw)
  To: Matthew Dharm
  Cc: Oliver Neukum, Doug Ledford, Luben Tuikov, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

On Thu, 23 Jan 2003, Matthew Dharm wrote:

> What I want:
> 
> I want to be able to free resources associated with a device within a
> finite (and well bounded) amount of time after I am notified that the
> device is gone.
> 
> I want to be able to inform the SCSI mid-layer, which will then inform
> higher layers, that the device is gone.  This is so that all may deal with
> this however they want.  I really don't care who does what, as long as we
> don't crash.  This implies that all block-type drivers will need to become
> hotplug aware, or the SCSI mid-layer will have to fake command failures.
> 
> I want to be able to do as little command-trickery as possible.  If I have
> to do it, then that means the next hotplug-capable LLDD must do it also.
> Duplication of code is bad -- it should all be handled in the mid-layer.

Matt's current proposed patch has the USB LLDD calling
scsi_unregister_host()  (because the device is respresented as an emulated
host adapter) when it learns that the device is gone.  Provided that
routine doesn't block for very long, provided it handles all the details
of hotplug notifications, and provided it guarantees that after it returns
there will be no more calls to the emulated adapter, I don't see any
problem.  The LLDD can go ahead and remove all records of the former
device, secure in the knowledge that all pointers to data structures and
entry points have been erased.

Isn't this exactly what everyone has been asking for and debating about?

Alan Stern



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [PATCH] USB changes for 2.5.58
  2003-01-24 15:34                                                                       ` Alan Stern
@ 2003-01-24 16:06                                                                         ` Oliver Neukum
  2003-01-24 17:58                                                                         ` [linux-usb-devel] " Doug Ledford
  2003-01-24 19:00                                                                         ` Luben Tuikov
  2 siblings, 0 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-24 16:06 UTC (permalink / raw)
  To: Alan Stern, Matthew Dharm
  Cc: Doug Ledford, Luben Tuikov, David Brownell, Mike Anderson,
	Greg KH, linux-usb-devel, Linux SCSI list


> Matt's current proposed patch has the USB LLDD calling
> scsi_unregister_host()  (because the device is respresented as an emulated
> host adapter) when it learns that the device is gone.  Provided that
> routine doesn't block for very long, provided it handles all the details
> of hotplug notifications, and provided it guarantees that after it returns
> there will be no more calls to the emulated adapter, I don't see any
> problem.  The LLDD can go ahead and remove all records of the former
> device, secure in the knowledge that all pointers to data structures and
> entry points have been erased.
>
> Isn't this exactly what everyone has been asking for and debating about?

If it only met the requirements, we'd be happy.
But it doesn't. By a large margin.

	Regards
		Oliver




-------------------------------------------------------
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 15:34                                                                       ` Alan Stern
  2003-01-24 16:06                                                                         ` Oliver Neukum
@ 2003-01-24 17:58                                                                         ` Doug Ledford
  2003-01-24 19:00                                                                         ` Luben Tuikov
  2 siblings, 0 replies; 106+ messages in thread
From: Doug Ledford @ 2003-01-24 17:58 UTC (permalink / raw)
  To: Alan Stern
  Cc: Matthew Dharm, Oliver Neukum, Luben Tuikov, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

On Fri, Jan 24, 2003 at 10:34:11AM -0500, Alan Stern wrote:
> On Thu, 23 Jan 2003, Matthew Dharm wrote:
> 
> > What I want:
> > 
> > I want to be able to free resources associated with a device within a
> > finite (and well bounded) amount of time after I am notified that the
> > device is gone.
> > 
> > I want to be able to inform the SCSI mid-layer, which will then inform
> > higher layers, that the device is gone.  This is so that all may deal with
> > this however they want.  I really don't care who does what, as long as we
> > don't crash.  This implies that all block-type drivers will need to become
> > hotplug aware, or the SCSI mid-layer will have to fake command failures.
> > 
> > I want to be able to do as little command-trickery as possible.  If I have
> > to do it, then that means the next hotplug-capable LLDD must do it also.
> > Duplication of code is bad -- it should all be handled in the mid-layer.
> 
> Matt's current proposed patch has the USB LLDD calling
> scsi_unregister_host()  (because the device is respresented as an emulated
> host adapter) when it learns that the device is gone.  Provided that
> routine doesn't block for very long, provided it handles all the details
> of hotplug notifications, and provided it guarantees that after it returns
> there will be no more calls to the emulated adapter,

There is no such guarantee of this.  Last I checked, 
scsi_unregister_host() can fail if devices are busy.  In fact, I would say 
that's exactly what happened on my system when I unplugged a USB2.0 CD-ROM 
that was in use.  The USB stack called this code, it failed due to busy, 
USB stack didn't care and blew state away, I knew have a permanent CD-ROM 
device that I can no longer clean up and I had to reboot my machine.

> I don't see any
> problem.  The LLDD can go ahead and remove all records of the former
> device, secure in the knowledge that all pointers to data structures and
> entry points have been erased.
> 
> Isn't this exactly what everyone has been asking for and debating about?
> 
> Alan Stern
> 

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606
  

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 15:34                                                                       ` Alan Stern
  2003-01-24 16:06                                                                         ` Oliver Neukum
  2003-01-24 17:58                                                                         ` [linux-usb-devel] " Doug Ledford
@ 2003-01-24 19:00                                                                         ` Luben Tuikov
  2003-01-24 22:23                                                                           ` Oliver.Neukum
  2 siblings, 1 reply; 106+ messages in thread
From: Luben Tuikov @ 2003-01-24 19:00 UTC (permalink / raw)
  To: Alan Stern
  Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Alan Stern wrote:
 >
> Matt's current proposed patch has the USB LLDD calling
> scsi_unregister_host()  (because the device is respresented as an emulated

A LLDD should and must *not* call scsi_unregister_host().  This brakes
all hierarchy.

Let's not make a distinction between a USB LLDD and any LLDD wrt hotplugging,
else we'll have a big mess and plenty of code duplication.

When a device gets unplugged and the LLDD notices it, it will set the device
to off in its own tables, call scsi_set_device_offline(dev) and from that
point on *if* any commands for that device step in through queuecommand()
method, they will return with the appropriate error.

scsi_set_device_offline(dev) will do whatever it has to do, which IMHO
is to, as Doug suggested, start error recovery (i.e. less code, less code
duplication in LLDD), and call an upper level hook. *

* Though I'm not quite certain where that error recovery should
be started... Maybe that upper level hook after doing whatever it
has to do, it will start the error recovery, or maybe it doesn't matter
for now.  But, an upper level hook call is, IMHO, mandatory for many
reasons.

Now, here we have the alternatives: scsi_set_device_offline() calls
slave_destroy(), or slave_destory() is called later on when upper
level structs has been cleaned.

In fact, it wouldn't matter, if SCSI Core takes over error return,
via post scsi_set_device_offline(dev) call.

> host adapter) when it learns that the device is gone.  Provided that
> routine doesn't block for very long, provided it handles all the details
> of hotplug notifications, and provided it guarantees that after it returns
> there will be no more calls to the emulated adapter, I don't see any
> problem.  The LLDD can go ahead and remove all records of the former
> device, secure in the knowledge that all pointers to data structures and
> entry points have been erased.

A LLDD shouldn't be so concerned with ``blocking'', unless *it* sets timers
on calls to an upper level like SCSI Core functions.

A LLDD has quite a clean and clear function it's supposed to do.

-- 
Luben

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 19:00                                                                         ` Luben Tuikov
@ 2003-01-24 22:23                                                                           ` Oliver.Neukum
  0 siblings, 0 replies; 106+ messages in thread
From: Oliver.Neukum @ 2003-01-24 22:23 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Alan Stern, Matthew Dharm, Oliver Neukum, Doug Ledford,
	David Brownell, Mike Anderson, Greg KH, linux-usb-devel,
	Linux SCSI list

On Fri, 24 Jan 2003, Luben Tuikov wrote:

> Alan Stern wrote:
>  >
> > Matt's current proposed patch has the USB LLDD calling
> > scsi_unregister_host()  (because the device is respresented as an emulated
>
> A LLDD should and must *not* call scsi_unregister_host().  This brakes
> all hierarchy.

What then is supposed to happen when you remove a PCMCIA host adapter or
a SCSI to USB bridge or ... ?
There must be a way to report that a host was unplugged.

	Regards
		Oliver



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [PATCH] USB changes for 2.5.58
  2003-01-23 23:25                                                                     ` Matthew Dharm
  2003-01-24 15:34                                                                       ` Alan Stern
@ 2003-01-24 19:10                                                                       ` Luben Tuikov
  2003-01-24 19:56                                                                         ` [linux-usb-devel] " Alan Stern
  2003-01-24 21:48                                                                       ` Doug Ledford
  2 siblings, 1 reply; 106+ messages in thread
From: Luben Tuikov @ 2003-01-24 19:10 UTC (permalink / raw)
  To: Matthew Dharm
  Cc: Oliver Neukum, Doug Ledford, Alan Stern, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Matthew Dharm wrote:
> 
> It would be nice if the SCSI mid-layer kept track of what commands were in
> what stages in who's queues.

My mini-scsi-core does exactly that.  Moving commands between queues is atomic.
The whole thing is completely reentrant and multithreaded capable, etc, etc.
It has a simple interface of send_command() and cancel_command(); doesn't have
device discovery though.

I tried to sell those features/ideas to SCSI Core, but we're not there yet.

Oh, I forgot to mention in my previous by date letter:
The entity which calls scsi_register_host() should by good design call
scsi_unregister_host().  It may be the case that when there are no
more devices associated with a particular host, then SCSI Core can
call scsi_unregister_host(), reminicent of SCSI Core early initialization.

-- 
Luben

-------------------------------------------------------
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 19:10                                                                       ` Luben Tuikov
@ 2003-01-24 19:56                                                                         ` Alan Stern
  2003-01-24 20:11                                                                           ` Luben Tuikov
  2003-01-24 21:09                                                                           ` Luben Tuikov
  0 siblings, 2 replies; 106+ messages in thread
From: Alan Stern @ 2003-01-24 19:56 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Are you aware that you are contradicting yourself?

On Fri, 24 Jan 2003, Luben Tuikov wrote:

> A LLDD should and must *not* call scsi_unregister_host().  This brakes
> all hierarchy.

On Fri, 24 Jan 2003, Luben Tuikov wrote:

> Oh, I forgot to mention in my previous by date letter:
> The entity which calls scsi_register_host() should by good design call
> scsi_unregister_host().

In usb-storage, it _is_ the LLDD that calls scsi_register_host().

Alan Stern


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 19:56                                                                         ` [linux-usb-devel] " Alan Stern
@ 2003-01-24 20:11                                                                           ` Luben Tuikov
  2003-01-24 21:09                                                                           ` Luben Tuikov
  1 sibling, 0 replies; 106+ messages in thread
From: Luben Tuikov @ 2003-01-24 20:11 UTC (permalink / raw)
  To: Alan Stern
  Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Alan Stern wrote:
> Are you aware that you are contradicting yourself?

Oops, yes, sorry, I was thinking about something else, .... doh!
Correction noted.

I've been known to do things like this -- think about something else
and write total crap :-) .

-- 
Luben

P.S. Plus, it's Friday and after a beer at lunch... this is
what we get :-)





^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 19:56                                                                         ` [linux-usb-devel] " Alan Stern
  2003-01-24 20:11                                                                           ` Luben Tuikov
@ 2003-01-24 21:09                                                                           ` Luben Tuikov
  2003-01-24 21:55                                                                             ` Alan Stern
  1 sibling, 1 reply; 106+ messages in thread
From: Luben Tuikov @ 2003-01-24 21:09 UTC (permalink / raw)
  To: Alan Stern
  Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Alan Stern wrote:
> Are you aware that you are contradicting yourself?
> 
> On Fri, 24 Jan 2003, Luben Tuikov wrote:
> 
> 
>>A LLDD should and must *not* call scsi_unregister_host().  This brakes
>>all hierarchy.
> 

What I probably meant is the detect()/release() pair; release() itself
normally calls scsi_unregister(host); the point is that it got nudged
from ``above'', i.e. SCSI Core.

How can a LLDD be certain that it can safely call scsi_unregister_host()
whenever it wishes?  As Doug pointed out this leads to problems.

Furhtermore, are we talking about scsi_unregister_host() or
scsi_unregister(host)?  The former does drivers and the latter
does hosts.  This would mean that my original statement was
nevertheless correct, how can a LLDD decide to unload itself safely?

-- 
Luben

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 21:09                                                                           ` Luben Tuikov
@ 2003-01-24 21:55                                                                             ` Alan Stern
  2003-01-24 22:03                                                                               ` Luben Tuikov
  2003-01-24 23:21                                                                               ` Mike Anderson
  0 siblings, 2 replies; 106+ messages in thread
From: Alan Stern @ 2003-01-24 21:55 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

On Fri, 24 Jan 2003, Luben Tuikov wrote:

> >>A LLDD should and must *not* call scsi_unregister_host().  This brakes
> >>all hierarchy.
> > 
> 
> What I probably meant is the detect()/release() pair; release() itself
> normally calls scsi_unregister(host); the point is that it got nudged
> from ``above'', i.e. SCSI Core.
> 
> How can a LLDD be certain that it can safely call scsi_unregister_host()
> whenever it wishes?  As Doug pointed out this leads to problems.

Apparently it can't.  I don't mean to say that this was the right thing to 
do; I just meant that this is what Matt's currently-proposed patch does.  
Personally, I'm not very familiar with the details of the SCSI subsystem, 
and I don't know what preconditions are required for calling the various 
API's.

> Furhtermore, are we talking about scsi_unregister_host() or
> scsi_unregister(host)?  The former does drivers and the latter
> does hosts.  This would mean that my original statement was
> nevertheless correct, how can a LLDD decide to unload itself safely?

I did indeed type it wrong.  The code first calls scsi_remove_host(host)  
and then it calls scsi_unregister(host).

Alan Stern


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 21:55                                                                             ` Alan Stern
@ 2003-01-24 22:03                                                                               ` Luben Tuikov
  2003-01-24 23:21                                                                               ` Mike Anderson
  1 sibling, 0 replies; 106+ messages in thread
From: Luben Tuikov @ 2003-01-24 22:03 UTC (permalink / raw)
  To: Alan Stern
  Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Alan Stern wrote:
> 
> Apparently it can't.  I don't mean to say that this was the right thing to 
> do; I just meant that this is what Matt's currently-proposed patch does.  
> Personally, I'm not very familiar with the details of the SCSI subsystem, 
> and I don't know what preconditions are required for calling the various 
> API's.

Ok, no problem.

Doug just explained it very good in his most recent email.  Take heed in
his words.  Everything he said is quite doable.

-- 
Luben




^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 21:55                                                                             ` Alan Stern
  2003-01-24 22:03                                                                               ` Luben Tuikov
@ 2003-01-24 23:21                                                                               ` Mike Anderson
  1 sibling, 0 replies; 106+ messages in thread
From: Mike Anderson @ 2003-01-24 23:21 UTC (permalink / raw)
  To: Alan Stern
  Cc: Luben Tuikov, Matthew Dharm, Oliver Neukum, Doug Ledford,
	David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

Alan Stern [stern@rowland.harvard.edu] wrote:
> On Fri, 24 Jan 2003, Luben Tuikov wrote:
> 
> > >>A LLDD should and must *not* call scsi_unregister_host().  This brakes
> > >>all hierarchy.
> > > 
> > 
> > What I probably meant is the detect()/release() pair; release() itself
> > normally calls scsi_unregister(host); the point is that it got nudged
> > from ``above'', i.e. SCSI Core.
> > 
> > How can a LLDD be certain that it can safely call scsi_unregister_host()
> > whenever it wishes?  As Doug pointed out this leads to problems.
> 
> Apparently it can't.  I don't mean to say that this was the right thing to 
> do; I just meant that this is what Matt's currently-proposed patch does.  
> Personally, I'm not very familiar with the details of the SCSI subsystem, 
> and I don't know what preconditions are required for calling the various 
> API's.
> 
> > Furhtermore, are we talking about scsi_unregister_host() or
> > scsi_unregister(host)?  The former does drivers and the latter
> > does hosts.  This would mean that my original statement was
> > nevertheless correct, how can a LLDD decide to unload itself safely?
> 
> I did indeed type it wrong.  The code first calls scsi_remove_host(host)  
> and then it calls scsi_unregister(host).
> 

I probably should have looked at Matt's patch closer, sorry. If a LLDD is
going to be using scsi_add_host and scsi_remove_host the driver should
not use scsi_register_host / scsi_unregister_host. 

If the driver is updated to the sysfs driver model then:

	1.) The drivers probe routine should call into the scsi mid with.
		scsi_register(...);
		scsi_add_host(...);

	2.) The drivers remove routine should call into the scsi mid with.

		scsi_remove_host(...);
		scsi_unregister(...);

	(scsi_remove_host is part of this current discussion).

The event that calls probe / remove is the device_register /
device_unregister of the the adapter device.

The LLDD's device_initcall / module_exit routines will call
driver_register and driver_unregister to cause device to driver
binding. Which will cause probe / remove to be called.

-andmike
--
Michael Anderson
andmike@us.ibm.com


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 23:25                                                                     ` Matthew Dharm
  2003-01-24 15:34                                                                       ` Alan Stern
  2003-01-24 19:10                                                                       ` Luben Tuikov
@ 2003-01-24 21:48                                                                       ` Doug Ledford
  2003-01-24 22:59                                                                         ` Mike Anderson
  2003-01-24 23:25                                                                         ` Matthew Dharm
  2 siblings, 2 replies; 106+ messages in thread
From: Doug Ledford @ 2003-01-24 21:48 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

On Thu, Jan 23, 2003 at 03:25:54PM -0800, Matthew Dharm wrote:
> Well, I've been watching this go on for days.  I hate to weigh in now, but
> I think someone needs to understand what the guy writing the code that is
> facing this problem really wants.
> 
> First, let me say that a USB storage device shows up as a HBA.  That's
> because some devices are actually USB/SCSI bridges.  But, since I work at
> the 'emulated host' level, that's where I'm focused.
> 
> What I want:
> 
> I want to be able to free resources associated with a device within a
> finite (and well bounded) amount of time after I am notified that the
> device is gone.

This is doable.  Partly.  You need to do it within the framework of what 
the scsi subsys needs, but it can be done.  Basically, the scsi core has 
an object allocation oriented paridigm where the lldd is expected to act 
as a driver that handles creation, use, and destruction of objects at the 
mid layer's direction.  Changing that paridigm would be very difficult.  
However, we don't need the usb subsys to maintain full state on the device 
once it has gone away, just enough state to be able to talk to the mid 
layer and tell it where the outstanding commands are.  Once the mid layer 
is finally done with it, the mid layer will tell your subsys to do a final 
free of the device by calling your release routine (for hosts) or your 
slave_destroy routine (for devices).  So, what you should be doing in 
order to be both a nice scsi host that plays well with the generic 
mechanism we have in place is when you get this removal event, you should 
be free'ing all the state you needed about the usb bus and such and taking 
this usb device off line or whatever you do.  Then let the scsi mid layer 
clean up at it's leisure.  You don't need to worry about it because the 
only thing you will have left is to wait for the scsi subsys to call you 
when it's time to delete things.  You don't even have to keep device 
references around because we pass those in to your deletion routines 
anyway.

> I want to be able to inform the SCSI mid-layer, which will then inform
> higher layers, that the device is gone.

scsi_set_device_offline() as we've been discussing.

>  This is so that all may deal with
> this however they want.  I really don't care who does what, as long as we
> don't crash.  This implies that all block-type drivers will need to become
> hotplug aware, or the SCSI mid-layer will have to fake command failures.

As I stated in my proposed design email, this will already be taken care 
of.  As long as you call scsi_set_device_offline() while holding the host 
lock for this host, then there will be no races between this call and the 
queuecommand() call which should be *all* that you care about.  You will 
no longer get commands for this device.  Now let me inform you of what the 
scsi subsys cares about.  You need to return *all* outstanding commands 
that this device has in your lldd before you go off and play "my device is 
gone so I'm free'ing everything".  That hasn't been the case from what 
I've seen, and that's what the scsi subsys needs.  If you don't return 
absolutely *all* outstanding commands before you go around free'ing stuff, 
then what happens is the scsi subsys needs those commands back in order to 
free up the scsi structs and it calls into the usb stack to get a reset or 
abort on the command and you guys have already free'd up your stuff and 
you don't claim to know what we are talking about.  At that point, we have 
a permanently stuck device.  This is the part where I said yesterday that 
you could handle this at the time you set the device offline by adding the 
code to return all the commands or you could just let your already present 
error handler code get called by the error handler thread and return the 
outstanding commands that way.  But, one way or the other, it has to be 
done.

> I want to be able to do as little command-trickery as possible.  If I have
> to do it, then that means the next hotplug-capable LLDD must do it also.
> Duplication of code is bad -- it should all be handled in the mid-layer.

No one has to do anything of the sort if you follow what I wrote.  Once 
the device is marked offline, any further commands already present in the 
request queue but not yet sent to the device plus any commands that come 
in to the request queue will all be sent back as I/O errors immediately 
before ever coming to you.

> As yet, the interface I have to the SCSI mid-layer fails on all three
> points here.

SCSI <-> USB interaction is wrong right now.  We know that.

> And now, some of my opinions on how this should all work:
> 
> It would be nice if the user informed us about removing the device before
> they did it.  But we shouldn't crash if they don't.

Correct.

> I don't want to be hanging around after a device is gone, spinning my
> wheels because some other part of the kernel can't handle the fact that the
> device is gone.  My driver is a passthru between the a SCSI emualted host
> and a physical USB device -- if my device is gone, I want to be out of
> there.  (Oddly enough, I'm starting to think there may be a DoS attack here
> if you force the LLDD to stay -- after all, it consumes memory....)

The device will go away.  But, because we clean up multiple things on a 
host, like an error handler thread that needs to be woke up and we have to 
wait for it to run and acknowledge the death, instantaneous removal of all 
the structs is simply unrealistic.  Unplugging your internal structs from 
the actual bus and then letting the scsi subsys clean them up at it's 
leisure is possible though, and that's what I'm asking for.

> Remember, the physical plug doesn't ask me if it's okay, and I don't get to
> ask the SCSI mid-layer if it's okay.  Yes, starting with the user clicking
> to tell us would be nice, but I don't get to see that.  All I get to see is
> an indication that the plug is pulled.

I don't disagree.  But because a plug is pulled has nothing to do with 
objects that might hold a reference to your object.  Just because you 
*want* it that way doesn't mean it's realistic to actually try and make it 
that way.  We can delete an object only after the last reference is 
released.

> I don't really give a rat's a** about 'how SCSI works' or how it's
> specified or CAM models or any of that.  I try to live in the real world as
> much as possible.  In that world, I'm not asking to remove an HBA -- I'm
> telling you it's been removed.  I can't call it back.  I can't even fake a
> command (other than perhaps INQUIRY) in any meaningful way.  THERE IS
> NOTHING I CAN DO BUT KEEP INSISTING THAT THE DEVICE IS GONE!

I wouldn't suggest otherwise.  I'm saying that in the real world of 
software design, it doesn't matter if your device is gone or not, if there 
are still references outstanding to the object then a free of that object 
is a software bug.  Come on now, quit being a prick about all this because 
you know I'm right on that point.  I'm not asking you to leave shit around 
forever, I'm asking you guys to work within the generic framework of 
object allocation and freeing that already exists in the scsi subsys to 
get things done.  The real problem is that you guys want this clean up 
process to be syncronous and I'm telling you that it's ever so slightly 
async because we have a couple schedules that have to take place and 
possibly a request queue to clean out.  No one seems willing to accept 
that a slightly async process is going to be OK.  Whatever.

> It would be nice if the user could inform various parts of the kernel that
> this device was going away, and then all sorts of cleanup could happen.
> But I really don't care -- all I'm trying to do is exit without a resource
> leak under all circumstances.

And so you shall if you follow the guidelines I set forth.

> It would be nice if the SCSI mid-layer kept track of what commands were in
> what stages in who's queues.  After all, if I hot-unplug a PCI SCSI
> controller, the controller really isn't going to be able to complete those
> commands for us -- we have to assume that commands queued by a LLDD are
> really just being sent to the hardware for queuing.  If that wasn't the
> case, then having LLDD queue capability doesn't make sense.

See, now you're just being a prick again.  You know the reason that the 
scsi mid layer *makes* the lldd tell us it is done with the commands is 
because if we go around saying "oh, this command was outstanding, let's 
free it" without getting clearance from the lldd then the lldd might still 
yet have state associated with the command.  We can't *ASSUME* jack crap 
in the mid layer.  You have to *TELL* us that you are done with a command.  
It is the only way to avoid races, resource leaks, and all sorts of other 
crap.  It amazes me that you think you should be excused from the job of 
cleaning up your queues after something happens like that.  Maybe the USB 
code doesn't allocate any internal command structs to go along with each 
scsi command, but pretty much every real scsi driver does and it's 
imperative that the driver go through all those commands and clean up 
after an event such as a removal, so what's the big deal about calling 
scsi_done() to *tell* us that you've cleaned up?

> Now, here's the kicker -- this is what I think Linus wants:
> 
> Linus said to me, with a degree of annoyance, that he doesn't want
> usb-storage to keep any associations of departed devices with SCSI emulated
> hosts.  That means that I need to be able to add and remove hosts at the
> will of the end-user.  In the end, what drives the entire process is what
> Linus' hand does when it's placed on his USB flashcard reader.

Well, the few milliseconds it might take to properly clean this stuff out
is well within the specifications of how fast a human can go around
plugging and unplugging a flashcard reader, so I'm not concerned that my
proposal doesn't meet these requirements.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [PATCH] USB changes for 2.5.58
  2003-01-24 21:48                                                                       ` Doug Ledford
@ 2003-01-24 22:59                                                                         ` Mike Anderson
  2003-01-24 23:17                                                                           ` [linux-usb-devel] " Doug Ledford
  2003-01-25  0:24                                                                           ` Luben Tuikov
  2003-01-24 23:25                                                                         ` Matthew Dharm
  1 sibling, 2 replies; 106+ messages in thread
From: Mike Anderson @ 2003-01-24 22:59 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

Doug,
	I started writing the interface you put forth in your email. I
	am currently debugging it in UML so I can generate the error
	conditions in a control manner. I still have some stuff to look
	at in the error handler with it running in this mode as it
	previously expected no one else to be possibly doing operations
	on the host. This could be the case if other LLDD's use this
	interface and have another device that happens to timeout an IO
	post a device being set offline.
	
A clarification question below.


Doug Ledford [dledford@redhat.com] wrote:

> So, what you should be doing in 
> order to be both a nice scsi host that plays well with the generic 
> mechanism we have in place is when you get this removal event, you should 
> be free'ing all the state you needed about the usb bus and such and taking 
> this usb device off line or whatever you do.  Then let the scsi mid layer 
> clean up at it's leisure.  You don't need to worry about it because the 
> only thing you will have left is to wait for the scsi subsys to call you 
> when it's time to delete things.  You don't even have to keep device 
> references around because we pass those in to your deletion routines 
> anyway.
> 
> > I want to be able to inform the SCSI mid-layer, which will then inform
> > higher layers, that the device is gone.
> 
> scsi_set_device_offline() as we've been discussing.
> 

I assumed that the hotplug event would only come from this function if
no commands where outstanding. If there where commands outstanding the
event would not be generated until the error handler gained ownership of
all the commands.

-andmike
--
Michael Anderson
andmike@us.ibm.com



-------------------------------------------------------
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 22:59                                                                         ` Mike Anderson
@ 2003-01-24 23:17                                                                           ` Doug Ledford
  2003-01-25  0:24                                                                           ` Luben Tuikov
  1 sibling, 0 replies; 106+ messages in thread
From: Doug Ledford @ 2003-01-24 23:17 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

On Fri, Jan 24, 2003 at 02:59:10PM -0800, Mike Anderson wrote:
> Doug,
> 	I started writing the interface you put forth in your email. I
> 	am currently debugging it in UML so I can generate the error
> 	conditions in a control manner.

Very cool!  Thanks Mike.

>	 I still have some stuff to look
> 	at in the error handler with it running in this mode as it
> 	previously expected no one else to be possibly doing operations
> 	on the host. This could be the case if other LLDD's use this
> 	interface and have another device that happens to timeout an IO
> 	post a device being set offline.

Unless someone changed things behind my back, we still have on eh thread 
per host don't we?  As such, since each device is it's own host in USB 
(and I think in ieee1394 as well), this shouldn't be an issue...

> A clarification question below.
> 
> 
> Doug Ledford [dledford@redhat.com] wrote:
> 
> > So, what you should be doing in 
> > order to be both a nice scsi host that plays well with the generic 
> > mechanism we have in place is when you get this removal event, you should 
> > be free'ing all the state you needed about the usb bus and such and taking 
> > this usb device off line or whatever you do.  Then let the scsi mid layer 
> > clean up at it's leisure.  You don't need to worry about it because the 
> > only thing you will have left is to wait for the scsi subsys to call you 
> > when it's time to delete things.  You don't even have to keep device 
> > references around because we pass those in to your deletion routines 
> > anyway.
> > 
> > > I want to be able to inform the SCSI mid-layer, which will then inform
> > > higher layers, that the device is gone.
> > 
> > scsi_set_device_offline() as we've been discussing.
> > 
> 
> I assumed that the hotplug event would only come from this function if
> no commands where outstanding. If there where commands outstanding the
> event would not be generated until the error handler gained ownership of
> all the commands.

Yes.  But more than that, we really want to also make sure that the 
current request queue for the device is empty of all commands before 
sending the hot plug event.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606
  

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 22:59                                                                         ` Mike Anderson
  2003-01-24 23:17                                                                           ` [linux-usb-devel] " Doug Ledford
@ 2003-01-25  0:24                                                                           ` Luben Tuikov
  2003-01-25  1:35                                                                             ` Mike Anderson
  1 sibling, 1 reply; 106+ messages in thread
From: Luben Tuikov @ 2003-01-25  0:24 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Oliver Neukum, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

Mike Anderson wrote:
> Doug,
> 	I started writing the interface you put forth in your email.

Do you mind clarifying?  Either it was a private email, or
one posted here, in which case there was an interpretation.

-- 
Luben




^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-25  0:24                                                                           ` Luben Tuikov
@ 2003-01-25  1:35                                                                             ` Mike Anderson
  0 siblings, 0 replies; 106+ messages in thread
From: Mike Anderson @ 2003-01-25  1:35 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Oliver Neukum, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

Luben Tuikov [luben@splentec.com] wrote:
> Mike Anderson wrote:
> >Doug,
> >	I started writing the interface you put forth in your email.
> 
> Do you mind clarifying?  Either it was a private email, or
> one posted here, in which case there was an interpretation.

It was posted here at the bottom of this email
http://marc.theaimsgroup.com/?l=linux-scsi&m=104335366403485&w=2

It is a starting point. 

-andmike
--
Michael Anderson
andmike@us.ibm.com


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 21:48                                                                       ` Doug Ledford
  2003-01-24 22:59                                                                         ` Mike Anderson
@ 2003-01-24 23:25                                                                         ` Matthew Dharm
  2003-01-25  0:05                                                                           ` Doug Ledford
  2003-01-25  1:24                                                                           ` Luben Tuikov
  1 sibling, 2 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-01-24 23:25 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 12830 bytes --]

So, if I read this correctly, you're saying that the correct sequence is:

(1) get disconnect notification from USB
(2) Call scsi_set_device_offline() (must hold host lock for this)
(3) call scsi_done() for all command in queue (max: 1)
(4) Call scsi_remove_host(), which should now work because no commands are
outstanding
(5) Call scsi_unregister()

And we're done, all structures can be freed.  And, as I understand it, the
following is true:

(a) once (2) is done, no more commands will be queued
(b) once (3) is done, (4) is guaranteed to work
(c) there is nothing the user can do to make this sequence take a long time

Tho, this does leave me with a couple of questions:

(i) Doesn't scsi_set_device_offline() work on devices, not hosts?  How do I
map from my host to my device list?
(ii) Do I need to call scsi_set_device_offline() for each device?  I
presume 'yes'.
(iii) What should I shove into the status field of the scsi command before
I scsi_done() it?

Oh, and as for my being a 'prick'.... my big problem is that the documented
interface is synchronous.  Async is fine with me, but up until this e-mail,
all I've seen is people arguing over what the sequence is, and theoretical
issues of what users should and should not do.  And I also think that a
large number of hotplugable hosts are going to replicate a whole bunch of
code to do (2)+(3)+(4) in one, synchronous burst.

If someone will step forward with a 'yes' or 'no' on this sequence, then
I'll get it done.  If the answer is 'no', then what did I miss?

Matt

On Fri, Jan 24, 2003 at 04:48:31PM -0500, Doug Ledford wrote:
> On Thu, Jan 23, 2003 at 03:25:54PM -0800, Matthew Dharm wrote:
> > Well, I've been watching this go on for days.  I hate to weigh in now, but
> > I think someone needs to understand what the guy writing the code that is
> > facing this problem really wants.
> > 
> > First, let me say that a USB storage device shows up as a HBA.  That's
> > because some devices are actually USB/SCSI bridges.  But, since I work at
> > the 'emulated host' level, that's where I'm focused.
> > 
> > What I want:
> > 
> > I want to be able to free resources associated with a device within a
> > finite (and well bounded) amount of time after I am notified that the
> > device is gone.
> 
> This is doable.  Partly.  You need to do it within the framework of what 
> the scsi subsys needs, but it can be done.  Basically, the scsi core has 
> an object allocation oriented paridigm where the lldd is expected to act 
> as a driver that handles creation, use, and destruction of objects at the 
> mid layer's direction.  Changing that paridigm would be very difficult.  
> However, we don't need the usb subsys to maintain full state on the device 
> once it has gone away, just enough state to be able to talk to the mid 
> layer and tell it where the outstanding commands are.  Once the mid layer 
> is finally done with it, the mid layer will tell your subsys to do a final 
> free of the device by calling your release routine (for hosts) or your 
> slave_destroy routine (for devices).  So, what you should be doing in 
> order to be both a nice scsi host that plays well with the generic 
> mechanism we have in place is when you get this removal event, you should 
> be free'ing all the state you needed about the usb bus and such and taking 
> this usb device off line or whatever you do.  Then let the scsi mid layer 
> clean up at it's leisure.  You don't need to worry about it because the 
> only thing you will have left is to wait for the scsi subsys to call you 
> when it's time to delete things.  You don't even have to keep device 
> references around because we pass those in to your deletion routines 
> anyway.
> 
> > I want to be able to inform the SCSI mid-layer, which will then inform
> > higher layers, that the device is gone.
> 
> scsi_set_device_offline() as we've been discussing.
> 
> >  This is so that all may deal with
> > this however they want.  I really don't care who does what, as long as we
> > don't crash.  This implies that all block-type drivers will need to become
> > hotplug aware, or the SCSI mid-layer will have to fake command failures.
> 
> As I stated in my proposed design email, this will already be taken care 
> of.  As long as you call scsi_set_device_offline() while holding the host 
> lock for this host, then there will be no races between this call and the 
> queuecommand() call which should be *all* that you care about.  You will 
> no longer get commands for this device.  Now let me inform you of what the 
> scsi subsys cares about.  You need to return *all* outstanding commands 
> that this device has in your lldd before you go off and play "my device is 
> gone so I'm free'ing everything".  That hasn't been the case from what 
> I've seen, and that's what the scsi subsys needs.  If you don't return 
> absolutely *all* outstanding commands before you go around free'ing stuff, 
> then what happens is the scsi subsys needs those commands back in order to 
> free up the scsi structs and it calls into the usb stack to get a reset or 
> abort on the command and you guys have already free'd up your stuff and 
> you don't claim to know what we are talking about.  At that point, we have 
> a permanently stuck device.  This is the part where I said yesterday that 
> you could handle this at the time you set the device offline by adding the 
> code to return all the commands or you could just let your already present 
> error handler code get called by the error handler thread and return the 
> outstanding commands that way.  But, one way or the other, it has to be 
> done.
> 
> > I want to be able to do as little command-trickery as possible.  If I have
> > to do it, then that means the next hotplug-capable LLDD must do it also.
> > Duplication of code is bad -- it should all be handled in the mid-layer.
> 
> No one has to do anything of the sort if you follow what I wrote.  Once 
> the device is marked offline, any further commands already present in the 
> request queue but not yet sent to the device plus any commands that come 
> in to the request queue will all be sent back as I/O errors immediately 
> before ever coming to you.
> 
> > As yet, the interface I have to the SCSI mid-layer fails on all three
> > points here.
> 
> SCSI <-> USB interaction is wrong right now.  We know that.
> 
> > And now, some of my opinions on how this should all work:
> > 
> > It would be nice if the user informed us about removing the device before
> > they did it.  But we shouldn't crash if they don't.
> 
> Correct.
> 
> > I don't want to be hanging around after a device is gone, spinning my
> > wheels because some other part of the kernel can't handle the fact that the
> > device is gone.  My driver is a passthru between the a SCSI emualted host
> > and a physical USB device -- if my device is gone, I want to be out of
> > there.  (Oddly enough, I'm starting to think there may be a DoS attack here
> > if you force the LLDD to stay -- after all, it consumes memory....)
> 
> The device will go away.  But, because we clean up multiple things on a 
> host, like an error handler thread that needs to be woke up and we have to 
> wait for it to run and acknowledge the death, instantaneous removal of all 
> the structs is simply unrealistic.  Unplugging your internal structs from 
> the actual bus and then letting the scsi subsys clean them up at it's 
> leisure is possible though, and that's what I'm asking for.
> 
> > Remember, the physical plug doesn't ask me if it's okay, and I don't get to
> > ask the SCSI mid-layer if it's okay.  Yes, starting with the user clicking
> > to tell us would be nice, but I don't get to see that.  All I get to see is
> > an indication that the plug is pulled.
> 
> I don't disagree.  But because a plug is pulled has nothing to do with 
> objects that might hold a reference to your object.  Just because you 
> *want* it that way doesn't mean it's realistic to actually try and make it 
> that way.  We can delete an object only after the last reference is 
> released.
> 
> > I don't really give a rat's a** about 'how SCSI works' or how it's
> > specified or CAM models or any of that.  I try to live in the real world as
> > much as possible.  In that world, I'm not asking to remove an HBA -- I'm
> > telling you it's been removed.  I can't call it back.  I can't even fake a
> > command (other than perhaps INQUIRY) in any meaningful way.  THERE IS
> > NOTHING I CAN DO BUT KEEP INSISTING THAT THE DEVICE IS GONE!
> 
> I wouldn't suggest otherwise.  I'm saying that in the real world of 
> software design, it doesn't matter if your device is gone or not, if there 
> are still references outstanding to the object then a free of that object 
> is a software bug.  Come on now, quit being a prick about all this because 
> you know I'm right on that point.  I'm not asking you to leave shit around 
> forever, I'm asking you guys to work within the generic framework of 
> object allocation and freeing that already exists in the scsi subsys to 
> get things done.  The real problem is that you guys want this clean up 
> process to be syncronous and I'm telling you that it's ever so slightly 
> async because we have a couple schedules that have to take place and 
> possibly a request queue to clean out.  No one seems willing to accept 
> that a slightly async process is going to be OK.  Whatever.
> 
> > It would be nice if the user could inform various parts of the kernel that
> > this device was going away, and then all sorts of cleanup could happen.
> > But I really don't care -- all I'm trying to do is exit without a resource
> > leak under all circumstances.
> 
> And so you shall if you follow the guidelines I set forth.
> 
> > It would be nice if the SCSI mid-layer kept track of what commands were in
> > what stages in who's queues.  After all, if I hot-unplug a PCI SCSI
> > controller, the controller really isn't going to be able to complete those
> > commands for us -- we have to assume that commands queued by a LLDD are
> > really just being sent to the hardware for queuing.  If that wasn't the
> > case, then having LLDD queue capability doesn't make sense.
> 
> See, now you're just being a prick again.  You know the reason that the 
> scsi mid layer *makes* the lldd tell us it is done with the commands is 
> because if we go around saying "oh, this command was outstanding, let's 
> free it" without getting clearance from the lldd then the lldd might still 
> yet have state associated with the command.  We can't *ASSUME* jack crap 
> in the mid layer.  You have to *TELL* us that you are done with a command.  
> It is the only way to avoid races, resource leaks, and all sorts of other 
> crap.  It amazes me that you think you should be excused from the job of 
> cleaning up your queues after something happens like that.  Maybe the USB 
> code doesn't allocate any internal command structs to go along with each 
> scsi command, but pretty much every real scsi driver does and it's 
> imperative that the driver go through all those commands and clean up 
> after an event such as a removal, so what's the big deal about calling 
> scsi_done() to *tell* us that you've cleaned up?
> 
> > Now, here's the kicker -- this is what I think Linus wants:
> > 
> > Linus said to me, with a degree of annoyance, that he doesn't want
> > usb-storage to keep any associations of departed devices with SCSI emulated
> > hosts.  That means that I need to be able to add and remove hosts at the
> > will of the end-user.  In the end, what drives the entire process is what
> > Linus' hand does when it's placed on his USB flashcard reader.
> 
> Well, the few milliseconds it might take to properly clean this stuff out
> is well within the specifications of how fast a human can go around
> plugging and unplugging a flashcard reader, so I'm not concerned that my
> proposal doesn't meet these requirements.
> 
> 
> 
> -- 
>   Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
>          Red Hat, Inc. 
>          1801 Varsity Dr.
>          Raleigh, NC 27606
>   
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

Sir, for the hundreth time, we do NOT carry 600-round boxes of belt-fed 
suction darts!
					-- Salesperson to Greg
User Friendly, 12/30/1997

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 23:25                                                                         ` Matthew Dharm
@ 2003-01-25  0:05                                                                           ` Doug Ledford
  2003-01-25  0:45                                                                             ` Matthew Dharm
  2003-02-02  3:49                                                                             ` Matthew Dharm
  2003-01-25  1:24                                                                           ` Luben Tuikov
  1 sibling, 2 replies; 106+ messages in thread
From: Doug Ledford @ 2003-01-25  0:05 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

On Fri, Jan 24, 2003 at 03:25:40PM -0800, Matthew Dharm wrote:
> So, if I read this correctly, you're saying that the correct sequence is:
> 
> (1) get disconnect notification from USB
> (2) Call scsi_set_device_offline() (must hold host lock for this)

Yes.

> (3) call scsi_done() for all command in queue (max: 1)

Hmmm...only 1?  USB limit or driver limit?

> (4) Call scsi_remove_host(), which should now work because no commands are
> outstanding

We may need to add code to scsi_remove_host() to allow it to clean out the 
request queue of the device when the device is offline and this call is 
made.  Just because we returned the 1 command you had outstanding doesn't 
mean that there weren't more in the request queue (especially true of hard 
disks like my mp3 player).  However, once the device is offline, cleaning 
out the queue is a known non-blocking operation, it just takes non-0 time 
as well.  Once the queue is cleaned out, we need to shut it down so that 
no more commands can come in to the block level.

> (5) Call scsi_unregister()
> 
> And we're done, all structures can be freed.  And, as I understand it, the
> following is true:
> 
> (a) once (2) is done, no more commands will be queued

To your driver, yes.  If Mike makes it clean out and disable the request 
queue at the same time, then we could answer this question as yes at the 
request queue level as well.

> (b) once (3) is done, (4) is guaranteed to work

No!  Remember, command completion is delayed!  We have a tasklet that 
processes your now complete command, and with that processing comes 
marking the device unbusy, which is also required for 4 to work.  That's 
why I was suggesting waking up the error handler thread and letting it 
finish this process off.  The error handler thread has the luxury of being 
able to wait for the command completion to happen, and in my opinion it's 
a slightly better place to do the work of cleaning out the request queue.

> (c) there is nothing the user can do to make this sequence take a long time

True.  We need time to do things in our very slightly async way, but the 
user isn't able to keep us from completing.

> Tho, this does leave me with a couple of questions:
> 
> (i) Doesn't scsi_set_device_offline() work on devices, not hosts?  How do I
> map from my host to my device list?

Well, in hosts.c::scsi_remove_host() we do it thusly:

        list_for_each_entry(sdev, &shost->my_devices, siblings)
                if (scsi_check_device_busy(sdev))
                        return 1;

> (ii) Do I need to call scsi_set_device_offline() for each device?  I
> presume 'yes'.

Yes.  As people pointed out to me the reason a USB device is done as a 
host is because it very well may *be* a host with several devices behind 
it, so it must handle the multiple device scenario correctly and set all 
devices offline and clean up after all of them that might be behind this 
bridge.

> (iii) What should I shove into the status field of the scsi command before
> I scsi_done() it?

Well, to force an error I always put DID_ERROR into the driver byte of 
the result dword, aka:

cmd->result = DID_ERROR << 16;

> Oh, and as for my being a 'prick'.... my big problem is that the documented
> interface is synchronous.  Async is fine with me, but up until this e-mail,
> all I've seen is people arguing over what the sequence is, and theoretical
> issues of what users should and should not do.  And I also think that a
> large number of hotplugable hosts are going to replicate a whole bunch of
> code to do (2)+(3)+(4) in one, synchronous burst.

Which would be wrong BTW.  If you can support multiple devices behind a 
bridge then you can't put (2)+(3)+(4) together in one burst.  That's why 
they aren't that way now.  As to the sync vs. async, the scsi mid layer 
quit being fully sync during the 2.4 timeframe.  When the old error 
handling code was dropped from 2.5+, all sync completion code was also 
dropped.

> If someone will step forward with a 'yes' or 'no' on this sequence, then
> I'll get it done.  If the answer is 'no', then what did I miss?

Just the tasklet completion issue.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-25  0:05                                                                           ` Doug Ledford
@ 2003-01-25  0:45                                                                             ` Matthew Dharm
  2003-01-25  1:07                                                                               ` Doug Ledford
  2003-02-02  3:49                                                                             ` Matthew Dharm
  1 sibling, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-01-25  0:45 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 3227 bytes --]

Ah... the sweet feeling of progress.

On Fri, Jan 24, 2003 at 07:05:29PM -0500, Doug Ledford wrote:
> On Fri, Jan 24, 2003 at 03:25:40PM -0800, Matthew Dharm wrote:
> > So, if I read this correctly, you're saying that the correct sequence is:
> > 
> > (1) get disconnect notification from USB
> > (2) Call scsi_set_device_offline() (must hold host lock for this)

> > (3) call scsi_done() for all command in queue (max: 1)
> 
> Hmmm...only 1?  USB limit or driver limit?

Driver limit.  I added support for queueing, but the queue is fixed at size
1.  It's an improvement for the future.

> > (4) Call scsi_remove_host(), which should now work because no commands are
> > outstanding
> 
> > (5) Call scsi_unregister()
> > 
> > And we're done, all structures can be freed.  And, as I understand it, the
> > following is true:
> > 
> > (b) once (3) is done, (4) is guaranteed to work
> 
> No!  Remember, command completion is delayed!  We have a tasklet that 
> processes your now complete command, and with that processing comes 
> marking the device unbusy, which is also required for 4 to work.  That's 
> why I was suggesting waking up the error handler thread and letting it 
> finish this process off.  The error handler thread has the luxury of being 
> able to wait for the command completion to happen, and in my opinion it's 
> a slightly better place to do the work of cleaning out the request queue.

Okay... so what do I do if it fails?  Sleep for a while and try again
later?  Wait on a flag somewhere?

> > Tho, this does leave me with a couple of questions:
> > 
> > (i) Doesn't scsi_set_device_offline() work on devices, not hosts?  How do I
> > map from my host to my device list?
> 
> Well, in hosts.c::scsi_remove_host() we do it thusly:
> 
>         list_for_each_entry(sdev, &shost->my_devices, siblings)
>                 if (scsi_check_device_busy(sdev))
>                         return 1;

Right, perfect example.

> > (iii) What should I shove into the status field of the scsi command before
> > I scsi_done() it?
> 
> Well, to force an error I always put DID_ERROR into the driver byte of 
> the result dword, aka:
> 
> cmd->result = DID_ERROR << 16;

Sounds reasonable.

> > Async is fine with me, but up until this e-mail,
> > all I've seen is people arguing over what the sequence is, and theoretical
> > issues of what users should and should not do.  And I also think that a
> > large number of hotplugable hosts are going to replicate a whole bunch of
> > code to do (2)+(3)+(4) in one, synchronous burst.
> 
> Which would be wrong BTW.  If you can support multiple devices behind a 
> bridge then you can't put (2)+(3)+(4) together in one burst.  That's why 
> they aren't that way now.

Hrm... I can see your point if we're talking about hotplugging an
individual device, but I don't see how (2)+(3)+(4) isn't what we want for
hotplugging an entire host.

Matt

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

You are needink to look more evil.  You likink very strong coffee?
					-- Pitr to Dust Puppy
User Friendly, 10/16/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-25  0:45                                                                             ` Matthew Dharm
@ 2003-01-25  1:07                                                                               ` Doug Ledford
  2003-02-02 18:13                                                                                 ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Doug Ledford @ 2003-01-25  1:07 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

On Fri, Jan 24, 2003 at 04:45:53PM -0800, Matthew Dharm wrote:
> Ah... the sweet feeling of progress.

Indeed ;-)

> Driver limit.  I added support for queueing, but the queue is fixed at size
> 1.  It's an improvement for the future.

OK.  Just curious.

> > No!  Remember, command completion is delayed!  We have a tasklet that 
> > processes your now complete command, and with that processing comes 
> > marking the device unbusy, which is also required for 4 to work.  That's 
> > why I was suggesting waking up the error handler thread and letting it 
> > finish this process off.  The error handler thread has the luxury of being 
> > able to wait for the command completion to happen, and in my opinion it's 
> > a slightly better place to do the work of cleaning out the request queue.
> 
> Okay... so what do I do if it fails?  Sleep for a while and try again
> later?  Wait on a flag somewhere?

Well, the better option is what I think we are working on.  Instead of
trying to remove the host completely, just unhook it from the USB stuff,
then call the set_scsi_device_offline(), then send back any outstanding
commands via scsi_done(), then possibly call the
scsi_schedule_host_removal() if Mike adds that function.  Then return.  
Don't to anything else.  Take all the remaining code you would normally
run at this point and put it into a function in your source called
usb_release() and in your Scsi_Host_Template struct that you pass to the
scsi layer, init the release pointer with the address of your
usb_release() routine.  That way, the scsi layer can do what it's best at,
taking care of the clean up details we've been talking about, and when
it's all done, it will call your usb_release() routine with a single
argument of the host struct you are wanting released.  At that point, you
can do all the freeing you would have done in that khubd loop (at least I
think that's the context you are doing the freeing from now) and know for
a fact that not only are you freeing everything, but so is the scsi mid
layer.  I think this will solve all the issues you've had, because this
*won't* leak, it won't block your other actions, and it lets the scsi
subsystem clean up properly.

> > > Async is fine with me, but up until this e-mail,
> > > all I've seen is people arguing over what the sequence is, and theoretical
> > > issues of what users should and should not do.  And I also think that a
> > > large number of hotplugable hosts are going to replicate a whole bunch of
> > > code to do (2)+(3)+(4) in one, synchronous burst.
> > 
> > Which would be wrong BTW.  If you can support multiple devices behind a 
> > bridge then you can't put (2)+(3)+(4) together in one burst.  That's why 
> > they aren't that way now.
> 
> Hrm... I can see your point if we're talking about hotplugging an
> individual device, but I don't see how (2)+(3)+(4) isn't what we want for
> hotplugging an entire host.

The numbers are gone from the email now so it's hard to reference, but I 
think I was commenting on the fact that if you have a true host device, 
then you might well be doing (2)+(3)*(number of devices behind 
bridge)+(4) or something like that.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-25  1:07                                                                               ` Doug Ledford
@ 2003-02-02 18:13                                                                                 ` Matthew Dharm
  2003-02-02 20:06                                                                                   ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-02 18:13 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 2129 bytes --]

So, was any of this ever implemented?  As far as I can tell, the required
changes were:

(o) addition of scsi_schedule_host_removal() (possibly optional)
(o) implementation of scsi_set_device_offline() (possibly optional)
(o) change the behavior of the 'hotplug initialization model' to call my
release function

Matt

On Fri, Jan 24, 2003 at 08:07:29PM -0500, Doug Ledford wrote:
> On Fri, Jan 24, 2003 at 04:45:53PM -0800, Matthew Dharm wrote:
> > Okay... so what do I do if it fails?  Sleep for a while and try again
> > later?  Wait on a flag somewhere?
> 
> Well, the better option is what I think we are working on.  Instead of
> trying to remove the host completely, just unhook it from the USB stuff,
> then call the set_scsi_device_offline(), then send back any outstanding
> commands via scsi_done(), then possibly call the
> scsi_schedule_host_removal() if Mike adds that function.  Then return.  
> Don't to anything else.  Take all the remaining code you would normally
> run at this point and put it into a function in your source called
> usb_release() and in your Scsi_Host_Template struct that you pass to the
> scsi layer, init the release pointer with the address of your
> usb_release() routine.  That way, the scsi layer can do what it's best at,
> taking care of the clean up details we've been talking about, and when
> it's all done, it will call your usb_release() routine with a single
> argument of the host struct you are wanting released.  At that point, you
> can do all the freeing you would have done in that khubd loop (at least I
> think that's the context you are doing the freeing from now) and know for
> a fact that not only are you freeing everything, but so is the scsi mid
> layer.  I think this will solve all the issues you've had, because this
> *won't* leak, it won't block your other actions, and it lets the scsi
> subsystem clean up properly.

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

We can customize our colonels.
					-- Tux
User Friendly, 12/1/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-02 18:13                                                                                 ` Matthew Dharm
@ 2003-02-02 20:06                                                                                   ` Matthew Dharm
  2003-02-03 17:17                                                                                     ` Mike Anderson
  0 siblings, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-02 20:06 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 2713 bytes --]

Willem Riede <wrlk@riede.org> suggested to me that I simply set
sdev->online = 0 for scsi_set_device_offline()

Any reason that isn't good enough?

On Sun, Feb 02, 2003 at 10:13:17AM -0800, Matthew Dharm wrote:
> So, was any of this ever implemented?  As far as I can tell, the required
> changes were:
> 
> (o) addition of scsi_schedule_host_removal() (possibly optional)
> (o) implementation of scsi_set_device_offline() (possibly optional)
> (o) change the behavior of the 'hotplug initialization model' to call my
> release function
> 
> Matt
> 
> On Fri, Jan 24, 2003 at 08:07:29PM -0500, Doug Ledford wrote:
> > On Fri, Jan 24, 2003 at 04:45:53PM -0800, Matthew Dharm wrote:
> > > Okay... so what do I do if it fails?  Sleep for a while and try again
> > > later?  Wait on a flag somewhere?
> > 
> > Well, the better option is what I think we are working on.  Instead of
> > trying to remove the host completely, just unhook it from the USB stuff,
> > then call the set_scsi_device_offline(), then send back any outstanding
> > commands via scsi_done(), then possibly call the
> > scsi_schedule_host_removal() if Mike adds that function.  Then return.  
> > Don't to anything else.  Take all the remaining code you would normally
> > run at this point and put it into a function in your source called
> > usb_release() and in your Scsi_Host_Template struct that you pass to the
> > scsi layer, init the release pointer with the address of your
> > usb_release() routine.  That way, the scsi layer can do what it's best at,
> > taking care of the clean up details we've been talking about, and when
> > it's all done, it will call your usb_release() routine with a single
> > argument of the host struct you are wanting released.  At that point, you
> > can do all the freeing you would have done in that khubd loop (at least I
> > think that's the context you are doing the freeing from now) and know for
> > a fact that not only are you freeing everything, but so is the scsi mid
> > layer.  I think this will solve all the issues you've had, because this
> > *won't* leak, it won't block your other actions, and it lets the scsi
> > subsystem clean up properly.
> 
> -- 
> Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
> Maintainer, Linux USB Mass Storage Driver
> 
> We can customize our colonels.
> 					-- Tux
> User Friendly, 12/1/1998



-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

Sir, for the hundreth time, we do NOT carry 600-round boxes of belt-fed 
suction darts!
					-- Salesperson to Greg
User Friendly, 12/30/1997

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-02 20:06                                                                                   ` Matthew Dharm
@ 2003-02-03 17:17                                                                                     ` Mike Anderson
  2003-02-16 21:18                                                                                       ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-02-03 17:17 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

Sorry Matthew I got side tracked on some issues for the last week. The
scsi_set_device_offline(); function has not been added to any of James
linux-scsi trees. You could add a ifndef in your code until we get the
interface in the tree. 

Would scsi_set_device_offline() do more than sdev->online = 0? It
depends on the state of the device at the calling of the function.

I currently have scsi_set_device_offline() trying to do the following:
	1.) set device offline and mark host in_recovery.
	2.) mark all outstanding commands to be canceled and wake up
	error handler.
	3.) flush the request queue.
	4.) Once device is really offline send hotplug event.

(2) needs some changes in the error handler which are needed in other
cases. This the large part of the change, but it is not directly part of
your request.

(3) need a cleaner way to flush request specials off the queue . It
would also be nice if there was a method to stop the incoming side of the
request queue.

(4) need export of do_hotplug interface or a method to generate a call
to it for an offline event.

I am working on these changes pretty much in order.

Doug suggested a scsi_schedule_host_removal(), but I thought we could
just change scsi_remove_host() to handle this task unless there is a
side effect that all callers would not want???.

-andmike

Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> Willem Riede <wrlk@riede.org> suggested to me that I simply set
> sdev->online = 0 for scsi_set_device_offline()
> 
> Any reason that isn't good enough?
> 
> On Sun, Feb 02, 2003 at 10:13:17AM -0800, Matthew Dharm wrote:
> > So, was any of this ever implemented?  As far as I can tell, the required
> > changes were:
> > 
> > (o) addition of scsi_schedule_host_removal() (possibly optional)
> > (o) implementation of scsi_set_device_offline() (possibly optional)
> > (o) change the behavior of the 'hotplug initialization model' to call my
> > release function
> > 
> > Matt
> > > then call the set_scsi_device_offline(), then send back any outstanding
> > > commands via scsi_done(), then possibly call the
> > > scsi_schedule_host_removal() if Mike adds that function.  Then return.  
> > > Don't to anything else.  Take all the remaining code you would normally
> > > run at this point and put it into a function in your source called
> > > usb_release() and in your Scsi_Host_Template struct that you pass to the
> > > scsi layer, init the release pointer with the address of your
> > > usb_release() routine.  That way, the scsi layer can do what it's best at,
> > > taking care of the clean up details we've been talking about, and when
> > > it's all done, it will call your usb_release() routine with a single
> > > argument of the host struct you are wanting released.  At that point, you
> > > can do all the freeing you would have done in that khubd loop (at least I
> > > think that's the context you are doing the freeing from now) and know for
> > > a fact that not only are you freeing everything, but so is the scsi mid
> > > layer.  I think this will solve all the issues you've had, because this
> > > *won't* leak, it won't block your other actions, and it lets the scsi
> > > subsystem clean up properly.


-andmike
--
Michael Anderson
andmike@us.ibm.com


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-03 17:17                                                                                     ` Mike Anderson
@ 2003-02-16 21:18                                                                                       ` Matthew Dharm
  2003-02-17 19:37                                                                                         ` Mike Anderson
  0 siblings, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-16 21:18 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 4225 bytes --]

Any updates on this?  I saw some patches, but they don't seem to be in my
tree (the usb tree, which is synced from Linus' tree).

People are starting to reports OOPSes to me because of this being
missing....

Matt

On Mon, Feb 03, 2003 at 09:17:26AM -0800, Mike Anderson wrote:
> Sorry Matthew I got side tracked on some issues for the last week. The
> scsi_set_device_offline(); function has not been added to any of James
> linux-scsi trees. You could add a ifndef in your code until we get the
> interface in the tree. 
> 
> Would scsi_set_device_offline() do more than sdev->online = 0? It
> depends on the state of the device at the calling of the function.
> 
> I currently have scsi_set_device_offline() trying to do the following:
> 	1.) set device offline and mark host in_recovery.
> 	2.) mark all outstanding commands to be canceled and wake up
> 	error handler.
> 	3.) flush the request queue.
> 	4.) Once device is really offline send hotplug event.
> 
> (2) needs some changes in the error handler which are needed in other
> cases. This the large part of the change, but it is not directly part of
> your request.
> 
> (3) need a cleaner way to flush request specials off the queue . It
> would also be nice if there was a method to stop the incoming side of the
> request queue.
> 
> (4) need export of do_hotplug interface or a method to generate a call
> to it for an offline event.
> 
> I am working on these changes pretty much in order.
> 
> Doug suggested a scsi_schedule_host_removal(), but I thought we could
> just change scsi_remove_host() to handle this task unless there is a
> side effect that all callers would not want???.
> 
> -andmike
> 
> Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> > Willem Riede <wrlk@riede.org> suggested to me that I simply set
> > sdev->online = 0 for scsi_set_device_offline()
> > 
> > Any reason that isn't good enough?
> > 
> > On Sun, Feb 02, 2003 at 10:13:17AM -0800, Matthew Dharm wrote:
> > > So, was any of this ever implemented?  As far as I can tell, the required
> > > changes were:
> > > 
> > > (o) addition of scsi_schedule_host_removal() (possibly optional)
> > > (o) implementation of scsi_set_device_offline() (possibly optional)
> > > (o) change the behavior of the 'hotplug initialization model' to call my
> > > release function
> > > 
> > > Matt
> > > > then call the set_scsi_device_offline(), then send back any outstanding
> > > > commands via scsi_done(), then possibly call the
> > > > scsi_schedule_host_removal() if Mike adds that function.  Then return.  
> > > > Don't to anything else.  Take all the remaining code you would normally
> > > > run at this point and put it into a function in your source called
> > > > usb_release() and in your Scsi_Host_Template struct that you pass to the
> > > > scsi layer, init the release pointer with the address of your
> > > > usb_release() routine.  That way, the scsi layer can do what it's best at,
> > > > taking care of the clean up details we've been talking about, and when
> > > > it's all done, it will call your usb_release() routine with a single
> > > > argument of the host struct you are wanting released.  At that point, you
> > > > can do all the freeing you would have done in that khubd loop (at least I
> > > > think that's the context you are doing the freeing from now) and know for
> > > > a fact that not only are you freeing everything, but so is the scsi mid
> > > > layer.  I think this will solve all the issues you've had, because this
> > > > *won't* leak, it won't block your other actions, and it lets the scsi
> > > > subsystem clean up properly.
> 
> 
> -andmike
> --
> Michael Anderson
> andmike@us.ibm.com
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

My mother not mind to die for stoppink Windows NT!  She is rememberink 
Stalin!
					-- Pitr
User Friendly, 9/6/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-16 21:18                                                                                       ` Matthew Dharm
@ 2003-02-17 19:37                                                                                         ` Mike Anderson
  2003-02-17 19:51                                                                                           ` Patrick Mansfield
  2003-02-23  7:48                                                                                           ` Matthew Dharm
  0 siblings, 2 replies; 106+ messages in thread
From: Mike Anderson @ 2003-02-17 19:37 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> Any updates on this?  I saw some patches, but they don't seem to be in my
> tree (the usb tree, which is synced from Linus' tree).
> 
> People are starting to reports OOPSes to me because of this being
> missing....
> 
> Matt
> 

The scsi_set_device_offline interface is part of the last patch (scsi
error) I sent to linux-scsi. I updated my patch post some comments from
the list, but I am working on issue with the patch before I resend.

-andmike
--
Michael Anderson
andmike@us.ibm.com


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-17 19:37                                                                                         ` Mike Anderson
@ 2003-02-17 19:51                                                                                           ` Patrick Mansfield
  2003-02-23  7:48                                                                                           ` Matthew Dharm
  1 sibling, 0 replies; 106+ messages in thread
From: Patrick Mansfield @ 2003-02-17 19:51 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

On Mon, Feb 17, 2003 at 11:37:37AM -0800, Mike Anderson wrote:
> 
> The scsi_set_device_offline interface is part of the last patch (scsi
> error) I sent to linux-scsi. I updated my patch post some comments from
> the list, but I am working on issue with the patch before I resend.
> 
> -andmike
> --
> Michael Anderson
> andmike@us.ibm.com

One point with the interface - can we have a bit higher level interface
based on what has happened to the adapter, so that if scsi wants to behave
differently in the future the interface need not change?

That is, have an exported scsi_host_removed(struct scsi_host *shost) versus
a scsi_set_device_offline(struct scsi_device *sdev)?

scsi_host_removed can offline all sdev's etc. for now, and if we ever want
to change data structure layouts or the behaviour in the future (i.e. allow
surprise removal/replacement of the same storage without forcing removal
of a scsi_device) the interface need not change.

-- Patrick Mansfield

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-17 19:37                                                                                         ` Mike Anderson
  2003-02-17 19:51                                                                                           ` Patrick Mansfield
@ 2003-02-23  7:48                                                                                           ` Matthew Dharm
  2003-02-26 23:37                                                                                             ` Mike Anderson
  1 sibling, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-23  7:48 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 1774 bytes --]

Okay, I see Linus has now accepted this into his tree.  It should propagate
to the USB development trees soon.

One question: What else is needed?  We set the device offline,
error/complete all pending commands, and the need to (somehow) make certain
we're in a good state for calling scsi_remove_host().  How do we make that
final guarantee?

There was talk that scsi_set_device_offline() would take care of that for
us by waking up the error handler... there seems to be code to do that....

There was talk of using the release() function from the SCSI template to
actually release resources....

So, what's the plan?

Matt


On Mon, Feb 17, 2003 at 11:37:37AM -0800, Mike Anderson wrote:
> Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> > Any updates on this?  I saw some patches, but they don't seem to be in my
> > tree (the usb tree, which is synced from Linus' tree).
> > 
> > People are starting to reports OOPSes to me because of this being
> > missing....
> > 
> > Matt
> > 
> 
> The scsi_set_device_offline interface is part of the last patch (scsi
> error) I sent to linux-scsi. I updated my patch post some comments from
> the list, but I am working on issue with the patch before I resend.
> 
> -andmike
> --
> Michael Anderson
> andmike@us.ibm.com
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

But where are the THEMES?!  How do you expect me to use an OS without 
themes?!
					-- Stef
User Friendly, 10/9/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-23  7:48                                                                                           ` Matthew Dharm
@ 2003-02-26 23:37                                                                                             ` Mike Anderson
  2003-02-27  1:10                                                                                               ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-02-26 23:37 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

Matthew,
	Sorry for the delay in replying (non coding activities are
	consuming to many hours).

Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> Okay, I see Linus has now accepted this into his tree.  It should propagate
> to the USB development trees soon.
> 
> One question: What else is needed?  We set the device offline,
> error/complete all pending commands, and the need to (somehow) make certain
> we're in a good state for calling scsi_remove_host().  How do we make that
> final guarantee?
> 
> There was talk that scsi_set_device_offline() would take care of that for
> us by waking up the error handler... there seems to be code to do that....
> 

Yes,

The scsi_set_device_offline will wake up the error handler to abort
outstanding commands.

> There was talk of using the release() function from the SCSI template to
> actually release resources....
> 
> So, what's the plan?

There still are a few things on the to do list, but should not effect the
LLDD interface (at least this is the goal).
	- scsi_request_fn needs a fix for device offline that will
	  handle all request types.
	- scsi_remove_host needs to call template release at the correct
	  time (ref counting ??).
	- need fix for offline hotplug event.
		- Should do_hotplug be exported or should device states
		  be added / fixed ??

Cleanups
	- Change scsi_remove_host for int to void function.


-andmike
--
Michael Anderson
andmike@us.ibm.com


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-26 23:37                                                                                             ` Mike Anderson
@ 2003-02-27  1:10                                                                                               ` Matthew Dharm
  2003-02-27  6:37                                                                                                 ` Mike Anderson
  0 siblings, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-27  1:10 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 1015 bytes --]

On Wed, Feb 26, 2003 at 03:37:02PM -0800, Mike Anderson wrote:
> There still are a few things on the to do list, but should not effect the
> LLDD interface (at least this is the goal).
> 	- scsi_request_fn needs a fix for device offline that will
> 	  handle all request types.
> 	- scsi_remove_host needs to call template release at the correct
> 	  time (ref counting ??).
> 	- need fix for offline hotplug event.
> 		- Should do_hotplug be exported or should device states
> 		  be added / fixed ??

Right... but I removed the release() function because that was marked (in
the documentation) as only for the old-style drivers.  So I'll need to
re-introduce it -- but it looks like all it has to do is free some memory.
Does that sound about right?

Matt

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

Da.  Am thinkink of carbonated borscht for lonk nights of coding.
					-- Pitr
User Friendly, 7/24/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-27  1:10                                                                                               ` Matthew Dharm
@ 2003-02-27  6:37                                                                                                 ` Mike Anderson
  2003-02-27 19:32                                                                                                   ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-02-27  6:37 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> Right... but I removed the release() function because that was marked (in
> the documentation) as only for the old-style drivers.  So I'll need to
> re-introduce it -- but it looks like all it has to do is free some memory.
> Does that sound about right?

Yes it was previously removed, but IIRC this is the direction
discussed on this thread. For a idle device the release functionality could
be done in the context of the scsi_remove_host call, but for a busy
device we need to have this call to clean up later.

-andmike
--
Michael Anderson
andmike@us.ibm.com


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-27  6:37                                                                                                 ` Mike Anderson
@ 2003-02-27 19:32                                                                                                   ` Matthew Dharm
  2003-03-01  1:41                                                                                                     ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-27 19:32 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
	linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 1636 bytes --]

This was discussed, but I didn't recall a firm decision.

I'll keep my eyes open for the patch that uses release().

Matt

On Wed, Feb 26, 2003 at 10:37:37PM -0800, Mike Anderson wrote:
> Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> > Right... but I removed the release() function because that was marked (in
> > the documentation) as only for the old-style drivers.  So I'll need to
> > re-introduce it -- but it looks like all it has to do is free some memory.
> > Does that sound about right?
> 
> Yes it was previously removed, but IIRC this is the direction
> discussed on this thread. For a idle device the release functionality could
> be done in the context of the scsi_remove_host call, but for a busy
> device we need to have this call to clean up later.
> 
> -andmike
> --
> Michael Anderson
> andmike@us.ibm.com
> 
> 
> 
> -------------------------------------------------------
> This SF.net email is sponsored by: Scholarships for Techies!
> Can't afford IT training? All 2003 ictp students receive scholarships.
> Get hands-on training in Microsoft, Cisco, Sun, Linux/UNIX, and more.
> www.ictp.com/training/sourceforge.asp
> _______________________________________________
> linux-usb-devel@lists.sourceforge.net
> To unsubscribe, use the last form field at:
> https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

C:  They kicked your ass, didn't they?
S:  They were cheating!
					-- The Chief and Stef
User Friendly, 11/19/1997

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-27 19:32                                                                                                   ` Matthew Dharm
@ 2003-03-01  1:41                                                                                                     ` Matthew Dharm
  0 siblings, 0 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-03-01  1:41 UTC (permalink / raw)
  To: Mike Anderson, Oliver Neukum, Luben Tuikov, Alan Stern,
	David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 2282 bytes --]

After conversing with Mike some more, there is a deadlock problem.  I need
to hold the host_lock to walk the device list, but I can't hold the
host_lock when I call scsi_set_device_offline().

Mike's looking into this.  Until he finds a good answer, no patch for
usb-storage.

Matt

On Thu, Feb 27, 2003 at 11:32:39AM -0800, Matthew Dharm wrote:
> This was discussed, but I didn't recall a firm decision.
> 
> I'll keep my eyes open for the patch that uses release().
> 
> Matt
> 
> On Wed, Feb 26, 2003 at 10:37:37PM -0800, Mike Anderson wrote:
> > Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> > > Right... but I removed the release() function because that was marked (in
> > > the documentation) as only for the old-style drivers.  So I'll need to
> > > re-introduce it -- but it looks like all it has to do is free some memory.
> > > Does that sound about right?
> > 
> > Yes it was previously removed, but IIRC this is the direction
> > discussed on this thread. For a idle device the release functionality could
> > be done in the context of the scsi_remove_host call, but for a busy
> > device we need to have this call to clean up later.
> > 
> > -andmike
> > --
> > Michael Anderson
> > andmike@us.ibm.com
> > 
> > 
> > 
> > -------------------------------------------------------
> > This SF.net email is sponsored by: Scholarships for Techies!
> > Can't afford IT training? All 2003 ictp students receive scholarships.
> > Get hands-on training in Microsoft, Cisco, Sun, Linux/UNIX, and more.
> > www.ictp.com/training/sourceforge.asp
> > _______________________________________________
> > linux-usb-devel@lists.sourceforge.net
> > To unsubscribe, use the last form field at:
> > https://lists.sourceforge.net/lists/listinfo/linux-usb-devel
> 
> -- 
> Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
> Maintainer, Linux USB Mass Storage Driver
> 
> C:  They kicked your ass, didn't they?
> S:  They were cheating!
> 					-- The Chief and Stef
> User Friendly, 11/19/1997



-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

It was a new hope.
					-- Dust Puppy
User Friendly, 12/25/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-25  0:05                                                                           ` Doug Ledford
  2003-01-25  0:45                                                                             ` Matthew Dharm
@ 2003-02-02  3:49                                                                             ` Matthew Dharm
  1 sibling, 0 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-02-02  3:49 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 1310 bytes --]

So, I was trying to implement this bit of logic in usb-storage, when I
discovered that scsi_set_device_offline() doesn't exist.

Okay, what did I miss?  I'm sure it's obvious, but for the life of me I
don't see it.

Matt

On Fri, Jan 24, 2003 at 07:05:29PM -0500, Doug Ledford wrote:
> > (i) Doesn't scsi_set_device_offline() work on devices, not hosts?  How do I
> > map from my host to my device list?
> 
> Well, in hosts.c::scsi_remove_host() we do it thusly:
> 
>         list_for_each_entry(sdev, &shost->my_devices, siblings)
>                 if (scsi_check_device_busy(sdev))
>                         return 1;
> 
> > (ii) Do I need to call scsi_set_device_offline() for each device?  I
> > presume 'yes'.
> 
> Yes.  As people pointed out to me the reason a USB device is done as a 
> host is because it very well may *be* a host with several devices behind 
> it, so it must handle the multiple device scenario correctly and set all 
> devices offline and clean up after all of them that might be behind this 
> bridge.

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

I'm seen in many forms.  Now open your mouth.  It's caffeine time.
					-- Cola Man to Greg
User Friendly, 10/28/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-24 23:25                                                                         ` Matthew Dharm
  2003-01-25  0:05                                                                           ` Doug Ledford
@ 2003-01-25  1:24                                                                           ` Luben Tuikov
  1 sibling, 0 replies; 106+ messages in thread
From: Luben Tuikov @ 2003-01-25  1:24 UTC (permalink / raw)
  To: Matthew Dharm
  Cc: Oliver Neukum, Alan Stern, David Brownell, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Matthew Dharm wrote:
> So, if I read this correctly, you're saying that the correct sequence is:
> 
> (1) get disconnect notification from USB
> (2) Call scsi_set_device_offline() (must hold host lock for this)
> (3) call scsi_done() for all command in queue (max: 1)

Right.  LLDD does (2) and (3), though, *and let's decide on this scsi ppl*
I'd rather (3) be initiated by SCSI Core, e.g. in recovery code.

> (4) Call scsi_remove_host(), which should now work because no commands are
> outstanding
> (5) Call scsi_unregister()

This is tricky.  I'd rather SCSI Core do those things.  I.e. when there
are no devices left, then SCSI Core could probably initiate the removal
of the host, just as it does for devices with slave_alloc().  Please
ppl, read Doug's original email *carefullly*, especially those who are
implementing this now (Mike?).

(You may decide to go over your queue and error them out, but the
error recovery thread should call your eh_abort() method -- this
is a more consistent way of doing this and you'd have *less*
code duplication in LLDD.)

The whole point is that when shared data is involved, i.e. hosts,
commands, devices, LLDD can tell SCSI Core what it (LLDD) wants
to be done eventually, (by, say telling scsi core about the unplug
event by calling scsi_set_device_offline()), and then when SCSI Core
decides that it is safe to do so, it will *remove* the device(s)
and/(or) host(s), the former by calling slave_destroy(), and the
latter by the release() method.

Please note, that a LLDD cannot ``run'' SCSI Core, and this is what
you've been inclined to do in all your emails.

For this I'm inclined to include in SHT/host, a host_volatile:1 flag
to mean whether the host is to be removed when there's no devices;
1 to mean that it is to be removed when # dev = 0, 0 to mean that
the host stays.

Some hosts may decide to stay even if there's no devices attached,
e.g. FC/SAN hosts. (since a device may come up any moment now, or
that the host has been authenticated and connected to a target
whos luns got pulled out at this moment in time -- and there's no
point in removing the host and then initiating a whole new
session/authenication/etc.)

USB Storage hosts will have .host_volatile = 1.

> And we're done, all structures can be freed.

SCSI Core will tell you when you're done, when your
release() method is called -- when SCSI Core decides
to do it.

>  And, as I understand it, the
> following is true:
> 
> (a) once (2) is done, no more commands will be queued

Ok, SCSI Core can take care of this.

> (b) once (3) is done, (4) is guaranteed to work

You shouldn't care of this -- your driver methods will get called.
You shouldn't know about how the layer above you implements it -- this
is the whole point of separating SCSI Core and LLDD as different
subsystems.

> (c) there is nothing the user can do to make this sequence take a long time

Right.

> Tho, this does leave me with a couple of questions:
> 
> (i) Doesn't scsi_set_device_offline() work on devices, not hosts?  How do I
> map from my host to my device list?

Doug answered this -- but it's even easier when you have one device per host.

> (ii) Do I need to call scsi_set_device_offline() for each device?  I
> presume 'yes'.

Mostly it would depend on USB.  Is it a normal host with several real
devices and just one of them is going away?  Or is it a bridge itself
which is going away and you must force unplug all device? I can imagine
this either way, though my knowledge of USB is limited. :-)

SCSI ppl: the transport may not be USB so let's not generalize if new
code is going into SCSI Core.

> (iii) What should I shove into the status field of the scsi command before
> I scsi_done() it?

DID_ERROR or DID_BAD_TARGET, since the device is gone.

> Oh, and as for my being a 'prick'.... my big problem is that the documented
> interface is synchronous.  Async is fine with me, but up until this e-mail,
> all I've seen is people arguing over what the sequence is, and theoretical
> issues of what users should and should not do.

Most of the things said were of interest and concern to SCSI ppl too.

>  And I also think that a
> large number of hotplugable hosts are going to replicate a whole bunch of
> code to do (2)+(3)+(4) in one, synchronous burst.

No, not quite -- those will be called by SCSI Core, depending on the
host.  Things may work differently for a host connected to a SAN, don't
you think?

For this reason we cannot lump up (2), (3) and (4).  They'll be separated
and SCSI Core will drive things up.

-- 
Luben

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 20:28                                                             ` Doug Ledford
  2003-01-23 20:59                                                               ` Oliver Neukum
@ 2003-01-24  0:15                                                               ` Patrick Mansfield
  2003-01-24  8:33                                                               ` David Brownell
  2 siblings, 0 replies; 106+ messages in thread
From: Patrick Mansfield @ 2003-01-24  0:15 UTC (permalink / raw)
  To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
	Linux SCSI list

On Thu, Jan 23, 2003 at 03:28:36PM -0500, Doug Ledford wrote:
> On Thu, Jan 23, 2003 at 08:40:41PM +0100, Oliver Neukum wrote:
> 
> No, scsi_set_device_offline() schedules the error handler thread for that 
> host to be woken up.
> 

Doug -

Why would the error handler need to run?

If the LLDD fails all outstanding command with an appropriate error (like
DID_NO_CONNECT), the failure is passed to the upper levels.

It seems that we could just set the device offline, make sure we do not
send commands for an offline device, and let the adapter fail all
outstanding commands.

-- Patrick Mansfield

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-23 20:28                                                             ` Doug Ledford
  2003-01-23 20:59                                                               ` Oliver Neukum
  2003-01-24  0:15                                                               ` Patrick Mansfield
@ 2003-01-24  8:33                                                               ` David Brownell
  2 siblings, 0 replies; 106+ messages in thread
From: David Brownell @ 2003-01-24  8:33 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Oliver Neukum, Luben Tuikov, Alan Stern, Matthew Dharm,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Doug Ledford wrote:

> Once all the commands are gone and no more are arriving, then if, and only 
> if, someone actually removes the device from the scsi subsystem (maybe 
> hotplug manager or something) then you will get the typical 
> slave_destroy() call to tell you that it is safe to release all resources 
> related to this device.  Otherwise, the device will hang around as an 
> offline device until someone does  

Does that mean there's no LLD state for that HBA (and/or devices
connected to it), so that some _other_ kind of state is representing
these zombie devices?

Seems like it must.  USB physical device model state must go away
when the device does, rather promptly ... the disconnect() is invoked
in a thread, so it can block for a very short while.   Blocking more
than a few milliseconds there is extremely antisocial though, so
that can't be the device state that will "hang around" until some user
mode agent does your suggested removal action.

I get the impression these zombie devices are largely what Linus has
recently asked to be removed from usb-storage.  Which has, so far,
been quite keen to re-animate them ... maybe a useful incremental
improvement would be just to give up the re-animation part, in such
a way that the scsi a/b/c/d identifiers eventually get recycled.

>   echo "scsi-remove-single-device a b c d" >/proc/scsi/scsi
> to remove it.

That ought to be trivial for some hotplug agent to do ... but there's
the issue of where "a b c d" come from.  Last I looked, there was no
obvious way to associate such data with the SCSI hotplug events; the
sysfs state wasn't very helpful.

- Dave

^ permalink raw reply	[flat|nested] 106+ messages in thread

* A different look at block device hotswap in the Linux kernel
  2003-01-23 18:19                                                       ` Oliver Neukum
  2003-01-23 19:07                                                         ` Luben Tuikov
@ 2003-01-23 20:41                                                         ` Steven Dake
  2003-01-23 21:07                                                           ` Matthew Jacob
  2003-01-24  0:07                                                           ` Oliver Neukum
  1 sibling, 2 replies; 106+ messages in thread
From: Steven Dake @ 2003-01-23 20:41 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Oliver and others,

In regards to hotswap, any real operating system should be _told_ that a 
block device is going to be removed from the top.  There are several 
reasons.

1) File mounts should be removed from the filesystem layer
2) files accessing block devices directly should be terminated
3) raid members using that block device should be hot removed
4) I'm sure you can think of others :)

The key is that the removal request should come from the top, not the 
bottom.  If someone is stupid enough to surprise remove a device (ie: 
unplug their USB SCSI device while the device is in use by the OS), they 
get what they deserve (I/O errors, dirty OS data, queued up requests 
which never shut down).  If they tell the OS that the device is going to 
be removed, so it may flush the device and shut down I/O to the device, 
the request should be granted on all accounts (expected removal).

The device driver should not be responsible for managing hotswap in any 
regard.  Its only purpose should be to tell the block device removal 
layer that a surprise extraction was initiated such that the block 
device removal code can ask the mid layer drivers to shut down error 
correction routines to the device and dump its pending I/O queue and 
clean up after the device.  The main advantage of this technique is 
simplicity (the LLDD's don't have to have repetative logic for each 
device driver), genericity (the block device removal code can be 
maintained in one place and be guaranteed to ensure the OS is in a 
stable state after a device is removed either surprise or expected and 
finally it solves the in-flight I/O problem by stopping new I/O to the 
device, shutting down I/O to the device, flushing the pending I/O 
queues, and killing all references in the OS of the device.

If you think about what your suggesting, your suggesting that the LLDD 
tells the scsi layer that the device is gone, that then times out errors 
and leaves the filesystem and sys_open/close file tables, and RAID 
layers in a state of disarray.  We don't want the LLDD knowing about the 
RAID system and whether it should tell the RAID layer to hot remove, do we?

I've developed code to do exactly what I have described here (surprise 
and expected extractions genericized into one file with one simple call 
from userland and a method for lower layers to indicate a surprise 
extraction if they have detected one.  I'll post as soon as I have time 
to make a patch against 2.5 .

Thanks
-steve

Oliver Neukum wrote:

>Am Donnerstag, 23. Januar 2003 18:46 schrieb Luben Tuikov:
>  
>
>>Oliver Neukum wrote:
>>    
>>
>>>Not all the world is a SAN. USB has no possibility to even try an
>>>interaction after the device is gone. We have to handle this flexibly.
>>>      
>>>
>>Thus the example in the original post.  I.e. for simple transports whose
>>portals get notified when a device is plugged off (USB), the LLDD
>>can notify SCSI Core, by setting a state variable in scsi_device.
>>In which case SCSI Core can answer with the proper TARGET error code.
>>(This was outlined before, scsi_command->online:1 ...)
>>    
>>
>
>Very well, so you agree that the SCSI layer should export to the LLDD
>a function to set devices offline?
>
>  
>
>>>In fact, if a device
>>>can vanish without a LLDD knowing about it, this is purely a problem of
>>>the SCSI layer.
>>>      
>>>
>>No, of course not.  (Think of IP.)  When a device vanishes and LLDD doesn't
>>know about it (more complicated transports), the CDB will return with
>>the proper Service Response, since the transport(s) won't be able to
>>deliver it. This will bubble up through SCSI Core and the error returned
>>will have to be the same as that of the simpler transports, as outlined
>>above.
>>    
>>
>
>Yes, sorry. To be precise, this means that the LLDD has to do nothing
>special, as it has to implement checking for a failing command anyway.
>But it's not entirely the same. If a command cannot be delivered it may or may
>not be appropriate to start error recovery. After the LLDD has told
>the SCSI layer that it has noticed a device going away, there must be no
>error recovery.
>
>  
>
>>>That means that we have to have a way to ensure that no more commands
>>>will reach the LLDD which can be triggered without any commands to be
>>>executed at all. This functionality has to come from the scsi mid layer.
>>>      
>>>
>>For simple transports yes; for more complicated ones, the CDB will
>>not be able to be delivered, and will return with error.
>>    
>>
>
>Good.
>So the first thing a LLDD has to do after it has learned about a device
>being removed is to have the device block.
>1. set device offline
>But commands may still be in flight.IMHO it is not right to assume that
>all commands now in flight to a device have failed, as some may have
>completed successfully in time, or failed for other reasons than unplugging.
>So it should be the LLDD's responsibility to finish the outstanding commands.
>Furthermore, there's a window for commands already having passed the check
>for offline but not yet being noticed by the LLDD. The simplest solution is to
>use a waiting primitive from RCU. So we are at:
>
>1. set device offline
>2. synchronize the kernel
>3. finish all pending commands
>
>So far with me?
>The LLDD could now forget about the device and be done with it.
>However there's a problem left. The device may come back.
>What happens if a device with the same ID is reconnected?
>
>	Regards
>		Oliver
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>
>  
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: A different look at block device hotswap in the Linux kernel
  2003-01-23 20:41                                                         ` A different look at block device hotswap in the Linux kernel Steven Dake
@ 2003-01-23 21:07                                                           ` Matthew Jacob
  2003-01-23 21:06                                                             ` Steven Dake
  2003-01-24  0:07                                                           ` Oliver Neukum
  1 sibling, 1 reply; 106+ messages in thread
From: Matthew Jacob @ 2003-01-23 21:07 UTC (permalink / raw)
  To: Steven Dake
  Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
	Linux SCSI list

>
> The key is that the removal request should come from the top, not the
> bottom.  If someone is stupid enough to surprise remove a device (ie:
> unplug their USB SCSI device while the device is in use by the OS), they
> get what they deserve (I/O errors, dirty OS data, queued up requests
> which never shut down).  If they tell the OS that the device is going to
> be removed, so it may flush the device and shut down I/O to the device,
> the request should be granted on all accounts (expected removal).
>

Hmm? Windows and OS/X cope with this just fine.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: A different look at block device hotswap in the Linux kernel
  2003-01-23 21:07                                                           ` Matthew Jacob
@ 2003-01-23 21:06                                                             ` Steven Dake
  2003-01-23 21:16                                                               ` Matthew Jacob
  0 siblings, 1 reply; 106+ messages in thread
From: Steven Dake @ 2003-01-23 21:06 UTC (permalink / raw)
  To: mjacob
  Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
	Linux SCSI list

I cant speak about OS/X, but I have crashed windows several times (BSOD) 
while hot removing a USB SCSI CDROM.  As you will notice, when you run 
windows and attach a device, there is a program that is started that 
allows you to notify the os of the removal so that it may properly 
remove the device from the OS instead of it being yanked.

Thanks
-steve

Matthew Jacob wrote:

>>The key is that the removal request should come from the top, not the
>>bottom.  If someone is stupid enough to surprise remove a device (ie:
>>unplug their USB SCSI device while the device is in use by the OS), they
>>get what they deserve (I/O errors, dirty OS data, queued up requests
>>which never shut down).  If they tell the OS that the device is going to
>>be removed, so it may flush the device and shut down I/O to the device,
>>the request should be granted on all accounts (expected removal).
>>
>>    
>>
>
>Hmm? Windows and OS/X cope with this just fine.
>
>
>
>
>  
>


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: A different look at block device hotswap in the Linux kernel
  2003-01-23 21:06                                                             ` Steven Dake
@ 2003-01-23 21:16                                                               ` Matthew Jacob
  0 siblings, 0 replies; 106+ messages in thread
From: Matthew Jacob @ 2003-01-23 21:16 UTC (permalink / raw)
  To: Steven Dake
  Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
	Linux SCSI list

Oh, well. I've pulled my camera in the middle of reads and just got the
usual whininess.

I think I was reacting to your "get what they deserve" comment. The end
goal of USB should probably *be* an alert that said "oh, dear, that
wasn't helpful- please put that memory stick back so I can finish
writing  it". The message "die, heathen dog luser!" is not exactly the
right idea.

In the matrix of outcomes of pulling a disk (or a fibre channel cable)
in the middle of I/O, there are many entries that are not recoverable,
many entries are hard to recover from, and many that are easy. This
should be irrelevant to the basic policy decision as to how you want
your system to be used- do you want it to require intervention so that
it is "safe" to change h/w? do you want I/O to autorestart after
(temporary) h/w topology changes? Have these questions been answered or
can they be answered via policies?

On Thu, 23 Jan 2003, Steven Dake wrote:

> I cant speak about OS/X, but I have crashed windows several times (BSOD)
> while hot removing a USB SCSI CDROM.  As you will notice, when you run
> windows and attach a device, there is a program that is started that
> allows you to notify the os of the removal so that it may properly
> remove the device from the OS instead of it being yanked.
>
> Thanks
> -steve
>
> Matthew Jacob wrote:
>
> >>The key is that the removal request should come from the top, not the
> >>bottom.  If someone is stupid enough to surprise remove a device (ie:
> >>unplug their USB SCSI device while the device is in use by the OS), they
> >>get what they deserve (I/O errors, dirty OS data, queued up requests
> >>which never shut down).  If they tell the OS that the device is going to
> >>be removed, so it may flush the device and shut down I/O to the device,
> >>the request should be granted on all accounts (expected removal).
> >>
> >>
> >>
> >
> >Hmm? Windows and OS/X cope with this just fine.
> >
> >
> >
> >
> >
> >
>
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: A different look at block device hotswap in the Linux kernel
  2003-01-23 20:41                                                         ` A different look at block device hotswap in the Linux kernel Steven Dake
  2003-01-23 21:07                                                           ` Matthew Jacob
@ 2003-01-24  0:07                                                           ` Oliver Neukum
  2003-01-24  0:21                                                             ` Matthew Jacob
  2003-01-24  0:54                                                             ` Steven Dake
  1 sibling, 2 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-24  0:07 UTC (permalink / raw)
  To: Steven Dake
  Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Am Donnerstag, 23. Januar 2003 21:41 schrieb Steven Dake:
> Oliver and others,
>
> In regards to hotswap, any real operating system should be _told_ that a
> block device is going to be removed from the top.  There are several
> reasons.

Users don't do what they should. It is as simple as that.
The hotplugging busses are supposed to handle that.

> 1) File mounts should be removed from the filesystem layer
> 2) files accessing block devices directly should be terminated
> 3) raid members using that block device should be hot removed
> 4) I'm sure you can think of others :)
>
> The key is that the removal request should come from the top, not the
> bottom.  If someone is stupid enough to surprise remove a device (ie:

No! You have to be able to handle a sudden failure. If you don't do this
you are already buggy. Hardware doesn't send advance notification before
failing. Data loss will occur. It's unavoidable. Anything else must not happen.
And a failure of hardware can only be recognised at the layer closest to
the hardware in the generic case.

> The device driver should not be responsible for managing hotswap in any
> regard.  Its only purpose should be to tell the block device removal

Yes.

> layer that a surprise extraction was initiated such that the block
> device removal code can ask the mid layer drivers to shut down error
> correction routines to the device and dump its pending I/O queue and
> clean up after the device.  The main advantage of this technique is

Yes. But not ask. Demand. There's no asking here. Do or die.

> If you think about what your suggesting, your suggesting that the LLDD
> tells the scsi layer that the device is gone, that then times out errors
> and leaves the filesystem and sys_open/close file tables, and RAID
> layers in a state of disarray.  We don't want the LLDD knowing about the
> RAID system and whether it should tell the RAID layer to hot remove, do we?

I want:
LLDD to SCSI: device is gone
SCSI to LLDD: Ok. I'll handle from here on.
LLDD: OK. I am gone. And won't have any contact until the next device is
plugged in.

The process can be somewhat more complicated, under some conditions:
- it never fails
- it is done within a finite, bounded, reasonable time

	Regards
		Oliver



-------------------------------------------------------
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: A different look at block device hotswap in the Linux kernel
  2003-01-24  0:07                                                           ` Oliver Neukum
@ 2003-01-24  0:21                                                             ` Matthew Jacob
  2003-01-24  7:53                                                               ` David Brownell
  2003-01-24  0:54                                                             ` Steven Dake
  1 sibling, 1 reply; 106+ messages in thread
From: Matthew Jacob @ 2003-01-24  0:21 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Steven Dake, Luben Tuikov, Alan Stern, David Brownell,
	Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
	Linux SCSI list

> I want:
> LLDD to SCSI: device is gone
> SCSI to LLDD: Ok. I'll handle from here on.
> LLDD: OK. I am gone. And won't have any contact until the next device is
> plugged in.
>
> The process can be somewhat more complicated, under some conditions:
> - it never fails
> - it is done within a finite, bounded, reasonable time

Could this time limit be fixed (or parameterized) known to all LLDDs?
This would allow one to try and avoid flooding SCSI with detach/reattach
events for the 'same' device.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: A different look at block device hotswap in the Linux kernel
  2003-01-24  0:21                                                             ` Matthew Jacob
@ 2003-01-24  7:53                                                               ` David Brownell
  2003-01-24 15:26                                                                 ` Matthew Jacob
  0 siblings, 1 reply; 106+ messages in thread
From: David Brownell @ 2003-01-24  7:53 UTC (permalink / raw)
  To: mjacob
  Cc: Oliver Neukum, Steven Dake, Luben Tuikov, Alan Stern,
	Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
	Linux SCSI list

Matthew Jacob wrote:
>>I want:
>>LLDD to SCSI: device is gone
>>SCSI to LLDD: Ok. I'll handle from here on.
>>LLDD: OK. I am gone. And won't have any contact until the next device is
>>plugged in.
>>
>>...
> 
> 
> Could this time limit be fixed (or parameterized) known to all LLDDs?
> This would allow one to try and avoid flooding SCSI with detach/reattach
> events for the 'same' device.

And what exactly is the "same" device?  And who's keeping history
about devices that have previously been attached?  And, says the guy
who's full of questions, didn't Linus want to get rid of such history?

- Dave





^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: A different look at block device hotswap in the Linux kernel
  2003-01-24  7:53                                                               ` David Brownell
@ 2003-01-24 15:26                                                                 ` Matthew Jacob
  0 siblings, 0 replies; 106+ messages in thread
From: Matthew Jacob @ 2003-01-24 15:26 UTC (permalink / raw)
  To: David Brownell
  Cc: Oliver Neukum, Steven Dake, Luben Tuikov, Alan Stern,
	Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
	Linux SCSI list


> >>...
> >
> >
> > Could this time limit be fixed (or parameterized) known to all LLDDs?
> > This would allow one to try and avoid flooding SCSI with detach/reattach
> > events for the 'same' device.
>
> And what exactly is the "same" device?  And who's keeping history
> about devices that have previously been attached?  And, says the guy
> who's full of questions, didn't Linus want to get rid of such history?

Hrmm. That's a damned good point. I was going to say things like "the
FC HBA driver knows that device XYX left the fabric and now has
returned", but if XYZ left the fabric, why am I keeping track of it
still? Once gone, it's gone. I had convinced myself that if an FC device
(re)appears, it's not up to the HBA to say it's the same (the content
may have been changed even if the container tag is the same).

Hrm.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: A different look at block device hotswap in the Linux kernel
  2003-01-24  0:07                                                           ` Oliver Neukum
  2003-01-24  0:21                                                             ` Matthew Jacob
@ 2003-01-24  0:54                                                             ` Steven Dake
  2003-01-24  2:35                                                               ` [linux-usb-devel] " Matthew Dharm
  1 sibling, 1 reply; 106+ messages in thread
From: Steven Dake @ 2003-01-24  0:54 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list



>I want:
>LLDD to SCSI: device is gone
>SCSI to LLDD: Ok. I'll handle from here on.
>LLDD: OK. I am gone. And won't have any contact until the next device is
>plugged in.
>  
>
The downside of this approach is that the LLDD must now be able to 
detect insertions and removals when it may not be able to do so.  If it 
is able to do so, then fine, it can tell upper layers about it, but the 
actual control of removal of a device should occur higher up to fix 
several problems with the approach of having the LLDD manage the hotswap 
state of the device.

>The process can be somewhat more complicated, under some conditions:
>- it never fails
>- it is done within a finite, bounded, reasonable time
>
>	Regards
>		Oliver
>
>
>
>  
>


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: A different look at block device hotswap in the Linux kernel
  2003-01-24  0:54                                                             ` Steven Dake
@ 2003-01-24  2:35                                                               ` Matthew Dharm
  0 siblings, 0 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-01-24  2:35 UTC (permalink / raw)
  To: Steven Dake
  Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
	Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 1622 bytes --]

On Thu, Jan 23, 2003 at 05:54:57PM -0700, Steven Dake wrote:
> >I want:
> >LLDD to SCSI: device is gone
> >SCSI to LLDD: Ok. I'll handle from here on.
> >LLDD: OK. I am gone. And won't have any contact until the next device is
> >plugged in.
> >  
> >
> The downside of this approach is that the LLDD must now be able to 
> detect insertions and removals when it may not be able to do so.  If it 
> is able to do so, then fine, it can tell upper layers about it, but the 
> actual control of removal of a device should occur higher up to fix 
> several problems with the approach of having the LLDD manage the hotswap 
> state of the device.

Huh?

Aren't we talking about a hotplug scenario?  How can you talk about the
'LLDD must now be able to detect... when it may not be able to do so.'?

Oh... I see.  We keep talking about devices.  I'm trying to hotswap an
entire host, which is mapped to a single USB device.

But the theory is the same, really.

In the end, you can only hotswap something that is hotswapable.  That means
that the driver has to support the hotswap system, whatever it is.

If you can't support hotswap detection, then this entire scenario is
reduced to 'what happens if I blow a FET on my HD', because it's the exact
same thing.  Recovering from fatal error is a separate discussion.

Matt

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

A:  The most ironic oxymoron wins ...
DP: "Microsoft Works"
A:  Uh, okay, you win.
					-- A.J. & Dust Puppy
User Friendly, 1/18/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-21 18:16                                         ` Luben Tuikov
  2003-01-21 19:00                                           ` Oliver Neukum
@ 2003-01-22 21:30                                           ` David Brownell
  1 sibling, 0 replies; 106+ messages in thread
From: David Brownell @ 2003-01-22 21:30 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Oliver Neukum, Matthew Dharm, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Luben Tuikov wrote:
> 
> When the Low Level Device Driver (LLDD), being the transport portal,
> notices that the device is going away or has gone away from the
> ``fabric'' (wlg), it will fire a device-gone event with the kernel.
> *Not* necessarily with SCSI Core, in fact I'd rather it didn't,
> but with a well defined kernel entry for device-gone events.
> 
> At the same time the LLDD will start returning TARGET gone, or
> whatever is appropriate to newly queued commands, and error out
> all internally queued commands (if it does it's own queuing).
> (I've seen this work nicely on mount and read/write(2) and fsck.)
> 
> I.e. the ``synchronization'' has started already by the LLDD erroring
> out commands, new and queued.

This model I like, though FWIW the USB code has pretty messy
handling of the (a) cancel queued requests and (b) reject new
requests parts of that model.  All the updates to handle that
would be transparent to device drivers (other than HCDs); we've
discussed it a bit.

> All the while the kernel has started higher level cleaning up,
> decrementing ref counts, etc, 

... Invoking some hotplug event that might unmount filesystems,
so _all_ relevant kernel state can be cleaned up ...

> But there's no such thing as ``waiting around indefinitely'' or
> ``blocking wait'' as you've suggested in some of your emails.

Right.  Though I can wish the driver model core actually used
its "enum device_state" and had an instance variable of that
type in "struct device".  That'd help the bus level driver
(for SCSI, an LLDD or LLD; for USB, an HCD) with (a) and (b).

The "no waiting indefinitely" is in part that because knowing
both (a) and (b) happen means you know the device quiesces.

- Dave

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 20:08                         ` David Brownell
  2003-01-20 20:48                           ` [linux-usb-devel] " Oliver Neukum
@ 2003-01-20 22:16                           ` Luben Tuikov
  2003-01-20 22:51                             ` David Brownell
  1 sibling, 1 reply; 106+ messages in thread
From: Luben Tuikov @ 2003-01-20 22:16 UTC (permalink / raw)
  To: David Brownell
  Cc: Matthew Dharm, Oliver Neukum, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

David Brownell wrote:
> Luben Tuikov wrote:
> 
>>> The way this should work is that the LLD calls scsi_remove_device(), and
>>> that cuts off the flow of commands.  The LLD can promise to error-out 
>>> any
>>> pending commands in the device command queue.
>>
>>
>>
>> I take it you mean that the transport will tell the LLDD that the device
>> is gone and it (LLDD) call the one above, SCSI Core to remove the device.
>>
>> Hmm, more thinking needs to be done here, as shouldn't this be handled
>> by hotplugging? I.e. Targets do not *initiate* events.
> 
> 
> Not exactly, but the bus driver ("transport"?) certainly does initiate
> reports like "here's a new device on the bus" or "that device is gone".
> That's when hotplugging kicks in (both in-kernel and in-userland).
> 
> And the only way to access a device ("target") on the bus is to give a
> request to that bus driver.  If, when servicing that request, the bus
> driver notices the device is gone ... that can act a lot like a device
> initiating a "device gone" event would look.

David, when I said ``... the transport will tell the LLDD that the
device ...'' this is *exactly* what I meant.  You're just repeating
it here in a more broken-down way.

By transport I mean USB, FC, SPI, etc; LLDD is the transport portal
and the initiator (aka the initiator port).  This terminology is not
really that new, but still not that old, and described in SAM-3.

>> The transport can notify that the device is gone, but an ULP entity will
>> call scsi_remove_device() not the other way around.
> 
> 
> That's how USB works today:  khubd shuts things down.  Device drivers
> get disconnect() callbacks, just as when their modules are removed.

Pardon me, I'm not very familiar with the USB subsystem, but this only
makes sense -- why would anyone do it any other way... :-)

> EXCEPT that "khubd" is part of usbcore (roughly analagous to parts
> of the scsi mid-layer) ... so the drivers acting as host side proxies
> for the target hardware ("usb device") are purely reactive.  Their
> only roles in hotplug scenarios are to bind to devices (when a new
> one appears, using probe callbacks) or unbind from them (when one
> goes away, using disconnect callbacks).

Very nice.

> Those disconnect() callbacks have a few key responsibilities, very
> much including shutting down the entire higher level I/O queue to
> that device.  I think you're saying that SCSI drivers don't have
> such a responsibility (unlike USB or PCI) ... if so, that would
> seem to be worth changing.

We just cannot let a transport event just wipe out a device,
without consulting hotplugging first -- think security.

SCSI drivers' (LLDD) responsibility is changing.  This is inevitable,
due to the reorganization of SAM-3 and SPC-3.  There's no more
such a thing as a ``bus'' in SCSI, e.g. ``Bus'' *may* be a
concept of the transport, and then again it may not.

General:
--------
SCSI was never designed to support Target initiated events.
SAM-3 has no provision for it, except passively when the next
command status is returned (e.g. UA).

For this reason, device removal is *transport* related event --
it has *nothing* to do with the SCSI target/target device, except
that it's gone :-) .

Being pedantic, this would be a /transport initiated event/ .
When this event takes place, the LLDD will notice it, and
let the kernel know about it, via a callback, all the while
the LLDD will return TARGET error (since it's gone), until
is has been told slave_destroy(), after which it should
never be queried of it, and if it is it should return the same
error.

That is, when a transport event takes place, the LLDD doesn't
have to ``run to'' SCSI Core right away. Just let the kernel
know about this event, and start returning errors, on newly
queued commands.

The kernel will decide what to do about this device going away,
i.e. hotplugging, sysop notification, etc.

I guess we're crossing in this discussion in such a way, just
because of USB and SCSI crossing here.  But if we think that
USB is the transport and that it could also be FC, SPI, SSA,
iSCSI, then a general framework of the workings is inevitable.

I.e. when talking about LLDDs we'd concentrate less on ``Target''
and more on ``transport''.

-- 
Luben

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 22:16                           ` Luben Tuikov
@ 2003-01-20 22:51                             ` David Brownell
  2003-01-20 23:27                               ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: David Brownell @ 2003-01-20 22:51 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Matthew Dharm, Oliver Neukum, Mike Anderson, Greg KH,
	linux-usb-devel, Linux SCSI list

Luben Tuikov wrote:

> David, when I said ``... the transport will tell the LLDD that the
> device ...'' this is *exactly* what I meant.  You're just repeating
> it here in a more broken-down way.

OK

> By transport I mean USB, FC, SPI, etc; LLDD is the transport portal
> and the initiator (aka the initiator port).  This terminology is not
> really that new, but still not that old, and described in SAM-3.

I was hoping for something described in the 2.5.58 kernel docs,
which only talks about LLD (Documentation/scsi) except in one case
(looked like a typo) ... I remember SAM-3 as a kind of missile!

> We just cannot let a transport event just wipe out a device,
> without consulting hotplugging first -- think security.

Certainly "device gone" would be an auditable event, but this is
primarily an integrity issue:  don't free objects until other
components have stopped using them.

If any components attach security policies to that "gone" state
transition, that'd be atypical but purely their own business.
(Like a transport erasing session master keys ... most transports
wouldn't have them, and would likely erase them as soon as the
device is known to be gone, no hotplug involved.)

> That is, when a transport event takes place, the LLDD doesn't
> have to ``run to'' SCSI Core right away. Just let the kernel
> know about this event, and start returning errors, on newly
> queued commands.
> 
> The kernel will decide what to do about this device going away,
> i.e. hotplugging, sysop notification, etc.

Sounds right. Except that it'd normally be the SCSI core that
we "let" know about the event. (Not always, I can imagine that
some transports might be able to kick in recovery procedures
and find some other path for accessing the device.  But in
such cases, SCSI might never see the device as "gone" ... )

- Dave

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 22:51                             ` David Brownell
@ 2003-01-20 23:27                               ` Oliver Neukum
  0 siblings, 0 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-20 23:27 UTC (permalink / raw)
  To: David Brownell, Luben Tuikov
  Cc: Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
	Linux SCSI list

> > By transport I mean USB, FC, SPI, etc; LLDD is the transport portal
> > and the initiator (aka the initiator port).  This terminology is not
> > really that new, but still not that old, and described in SAM-3.
>
> I was hoping for something described in the 2.5.58 kernel docs,
> which only talks about LLD (Documentation/scsi) except in one case
> (looked like a typo) ... I remember SAM-3 as a kind of missile!

Possibly the view people at IBM have of SCSI is more comprehensive
than ordinarily used. We are probably better of talking about low
level drivers (LLDs). Whether these drive devices, busses or
transport mechanisms is not really relevant here.

> > We just cannot let a transport event just wipe out a device,
> > without consulting hotplugging first -- think security.
>
> Certainly "device gone" would be an auditable event, but this is
> primarily an integrity issue:  don't free objects until other
> components have stopped using them.

Right. There's nothing wrong with a LLD having to wait for
a _limited_ time by making a blocking call to the midlayer.
But that call must finish within a limited,reasonable time and it must
succeed. The complexity for handling hotunplugging belongs
squarely in a centralised place, but not every LLD. 

It is true that currently hotplugging is a generic safety problem
(unless you use devfs).
The problem is reuse of device nodes leading to a race with
the hotplug skripts. Old permissions may for a time be applied
to a new device. But that just means that the hotplugging
user space notification model is incomplete. Now the simple fix with
a callback into the LLD which would allow simply waiting for the
unplugging skript to run is true madness. You cannot stall a hotpluggable
subsystem waiting on a skript, neither can you do sane error handling.
Problem should be fixed where they arise, eg. "lock" a new hotplugged
device.

> If any components attach security policies to that "gone" state
> transition, that'd be atypical but purely their own business.
> (Like a transport erasing session master keys ... most transports
> wouldn't have them, and would likely erase them as soon as the
> device is known to be gone, no hotplug involved.)

Right.

> > That is, when a transport event takes place, the LLDD doesn't
> > have to ``run to'' SCSI Core right away. Just let the kernel
> > know about this event, and start returning errors, on newly
> > queued commands.
> >
> > The kernel will decide what to do about this device going away,
> > i.e. hotplugging, sysop notification, etc.
>
> Sounds right. Except that it'd normally be the SCSI core that
> we "let" know about the event. (Not always, I can imagine that
> some transports might be able to kick in recovery procedures
> and find some other path for accessing the device.  But in
> such cases, SCSI might never see the device as "gone" ... )

I must disagree. There's no decision involved here. You handle
tasks having the device open, clean things up, free up the resources
and fire off a user space notification.
Decisions get made in user space where policy can be reasonably
implemented.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 106+ messages in thread

* RE: A different look at block device hotswap in the Linux kernel
@ 2003-01-24 16:36 Cress, Andrew R
  2003-01-24 18:01 ` Bryan Henderson
  0 siblings, 1 reply; 106+ messages in thread
From: Cress, Andrew R @ 2003-01-24 16:36 UTC (permalink / raw)
  To: 'mjacob@feral.com', David Brownell
  Cc: Oliver Neukum, Steven Dake, Luben Tuikov, Alan Stern,
	Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
	Linux SCSI list

My $.02:

The comparing of a saved device list snapshot with the current device should
be the responsibility of a user-space daemon, provided that the kernel
exposes enough information  to uniquely identify the devices (like serial
numbers, or some other UID if no ser num exists).  

The kernel would assume that the device is new (not the same) unless told so
by a daemon that is watching.  

Andy

-----Original Message-----
From: Matthew Jacob [mailto:mjacob@feral.com] 
Sent: Friday, January 24, 2003 10:26 AM
To: David Brownell
Cc: Oliver Neukum; Steven Dake; Luben Tuikov; Alan Stern; Matthew Dharm;
Mike Anderson; Greg KH; linux-usb-devel@lists.sourceforge.net; Linux SCSI
list
Subject: Re: A different look at block device hotswap in the Linux kernel

> >>...
> >
> >
> > Could this time limit be fixed (or parameterized) known to all LLDDs?
> > This would allow one to try and avoid flooding SCSI with detach/reattach
> > events for the 'same' device.
>
> And what exactly is the "same" device?  And who's keeping history
> about devices that have previously been attached?  And, says the guy
> who's full of questions, didn't Linus want to get rid of such history?

Hrmm. That's a damned good point. I was going to say things like "the
FC HBA driver knows that device XYX left the fabric and now has
returned", but if XYZ left the fabric, why am I keeping track of it
still? Once gone, it's gone. I had convinced myself that if an FC device
(re)appears, it's not up to the HBA to say it's the same (the content
may have been changed even if the container tag is the same).

Hrm.

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 106+ messages in thread

* RE: A different look at block device hotswap in the Linux kernel
  2003-01-24 16:36 A different look at block device hotswap in the Linux kernel Cress, Andrew R
@ 2003-01-24 18:01 ` Bryan Henderson
  2003-01-24 18:09   ` Matthew Jacob
  0 siblings, 1 reply; 106+ messages in thread
From: Bryan Henderson @ 2003-01-24 18:01 UTC (permalink / raw)
  To: Cress, Andrew R
  Cc: andmike, David Brownell, Greg KH, Linux SCSI list,
	linux-usb-devel, Luben Tuikov, Matthew Dharm,
	'mjacob@feral.com', Oliver Neukum, Steven Dake,
	Alan Stern





>The comparing of a saved device list snapshot with the current device
should
>be the responsibility of

>From a usability standpoint, I don't think any such comparing should be
done by anyone.  When I unplug a device and then plug it in again, I want a
total reset.  I'm willing to take my lumps if I unplug something that isn't
in a state to be safely unplugged.

It's like when I pull the power plug because my system is totally hosed and
I want to start over.  I know I can cause damage by doing that, but I would
be upset if the new system booted back to the broken state it was in when I
unplugged it.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* RE: A different look at block device hotswap in the Linux kernel
  2003-01-24 18:01 ` Bryan Henderson
@ 2003-01-24 18:09   ` Matthew Jacob
  0 siblings, 0 replies; 106+ messages in thread
From: Matthew Jacob @ 2003-01-24 18:09 UTC (permalink / raw)
  To: Bryan Henderson
  Cc: Cress, Andrew R, andmike, David Brownell, Greg KH,
	Linux SCSI list, linux-usb-devel, Luben Tuikov, Matthew Dharm,
	Oliver Neukum, Steven Dake, Alan Stern

>
> It's like when I pull the power plug because my system is totally hosed and
> I want to start over.  I know I can cause damage by doing that, but I would
> be upset if the new system booted back to the broken state it was in when I
> unplugged it.

I had this conversation with doug offlist- this is a policy choice. You
may want your device to reattach as totally new. You may, on the other
hand, want your device to resume where you left off. I can see valid
reasons for wanting either behaviour (but it can't/shouldn't be deduced
by the OS).



^ permalink raw reply	[flat|nested] 106+ messages in thread

end of thread, other threads:[~2003-03-01  1:41 UTC | newest]

Thread overview: 106+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <10426732153816@kroah.com>
     [not found] ` <10426732212871@kroah.com>
     [not found]   ` <20030116093112.B29001@one-eyed-alien.net>
     [not found]     ` <20030116173539.GA31235@kroah.com>
2003-01-16 19:43       ` [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 Matthew Dharm
2003-01-16 19:53         ` Greg KH
     [not found]         ` <20030116195306.GA32697@kroah.com>
2003-01-16 20:10           ` Linus Torvalds
2003-01-16 20:43             ` greg kh
2003-01-16 21:41             ` Linus Torvalds
2003-01-16 22:51             ` Matthew Dharm
2003-01-16 20:40           ` David Brownell
2003-01-16 20:48             ` Mike Anderson
2003-01-16 23:43               ` Oliver Neukum
2003-01-17  8:50                 ` Mike Anderson
2003-01-17 10:55                   ` Oliver Neukum
2003-01-17 15:06                     ` Alan Stern
2003-01-17 18:54                     ` Matthew Dharm
2003-01-17 20:25                       ` Mike Anderson
2003-01-17 22:07                         ` Oliver Neukum
2003-01-17 20:26                       ` [linux-usb-devel] " Oliver Neukum
2003-01-17 20:49                         ` Mike Anderson
2003-01-20 17:36                       ` Luben Tuikov
2003-01-20 18:23                         ` Oliver Neukum
2003-01-20 18:56                           ` Luben Tuikov
2003-01-20 19:10                             ` [linux-usb-devel] " Oliver Neukum
2003-01-20 19:50                             ` David Brownell
2003-01-21  3:31                           ` Alan
2003-01-21  7:17                             ` Oliver Neukum
2003-01-21 11:57                               ` [linux-usb-devel] " Douglas Gilbert
2003-01-21 13:48                                 ` Oliver Neukum
2003-01-21 18:22                                   ` Luben Tuikov
2003-01-21 13:30                             ` James Bottomley
2003-01-20 20:08                         ` David Brownell
2003-01-20 20:48                           ` [linux-usb-devel] " Oliver Neukum
2003-01-20 21:24                             ` David Brownell
2003-01-20 21:51                               ` [linux-usb-devel] " Oliver Neukum
2003-01-20 22:26                                 ` David Brownell
2003-01-20 23:00                                   ` Oliver Neukum
2003-01-21  0:44                                     ` David Brownell
2003-01-21  0:50                                       ` Oliver Neukum
2003-01-21 18:16                                         ` Luben Tuikov
2003-01-21 19:00                                           ` Oliver Neukum
2003-01-21 20:02                                             ` [linux-usb-devel] " Luben Tuikov
2003-01-21 21:02                                               ` Alan Stern
2003-01-22 21:50                                                 ` Luben Tuikov
2003-01-22 22:46                                                   ` Oliver Neukum
2003-01-23 17:46                                                     ` Luben Tuikov
2003-01-23 18:19                                                       ` Oliver Neukum
2003-01-23 19:07                                                         ` Luben Tuikov
2003-01-23 19:40                                                           ` Oliver Neukum
2003-01-23 20:28                                                             ` Doug Ledford
2003-01-23 20:59                                                               ` Oliver Neukum
2003-01-23 21:34                                                                 ` Doug Ledford
2003-01-23 22:39                                                                   ` Oliver Neukum
2003-01-23 23:23                                                                     ` Doug Ledford
2003-01-23 23:25                                                                     ` Matthew Dharm
2003-01-24 15:34                                                                       ` Alan Stern
2003-01-24 16:06                                                                         ` Oliver Neukum
2003-01-24 17:58                                                                         ` [linux-usb-devel] " Doug Ledford
2003-01-24 19:00                                                                         ` Luben Tuikov
2003-01-24 22:23                                                                           ` Oliver.Neukum
2003-01-24 19:10                                                                       ` Luben Tuikov
2003-01-24 19:56                                                                         ` [linux-usb-devel] " Alan Stern
2003-01-24 20:11                                                                           ` Luben Tuikov
2003-01-24 21:09                                                                           ` Luben Tuikov
2003-01-24 21:55                                                                             ` Alan Stern
2003-01-24 22:03                                                                               ` Luben Tuikov
2003-01-24 23:21                                                                               ` Mike Anderson
2003-01-24 21:48                                                                       ` Doug Ledford
2003-01-24 22:59                                                                         ` Mike Anderson
2003-01-24 23:17                                                                           ` [linux-usb-devel] " Doug Ledford
2003-01-25  0:24                                                                           ` Luben Tuikov
2003-01-25  1:35                                                                             ` Mike Anderson
2003-01-24 23:25                                                                         ` Matthew Dharm
2003-01-25  0:05                                                                           ` Doug Ledford
2003-01-25  0:45                                                                             ` Matthew Dharm
2003-01-25  1:07                                                                               ` Doug Ledford
2003-02-02 18:13                                                                                 ` Matthew Dharm
2003-02-02 20:06                                                                                   ` Matthew Dharm
2003-02-03 17:17                                                                                     ` Mike Anderson
2003-02-16 21:18                                                                                       ` Matthew Dharm
2003-02-17 19:37                                                                                         ` Mike Anderson
2003-02-17 19:51                                                                                           ` Patrick Mansfield
2003-02-23  7:48                                                                                           ` Matthew Dharm
2003-02-26 23:37                                                                                             ` Mike Anderson
2003-02-27  1:10                                                                                               ` Matthew Dharm
2003-02-27  6:37                                                                                                 ` Mike Anderson
2003-02-27 19:32                                                                                                   ` Matthew Dharm
2003-03-01  1:41                                                                                                     ` Matthew Dharm
2003-02-02  3:49                                                                             ` Matthew Dharm
2003-01-25  1:24                                                                           ` Luben Tuikov
2003-01-24  0:15                                                               ` Patrick Mansfield
2003-01-24  8:33                                                               ` David Brownell
2003-01-23 20:41                                                         ` A different look at block device hotswap in the Linux kernel Steven Dake
2003-01-23 21:07                                                           ` Matthew Jacob
2003-01-23 21:06                                                             ` Steven Dake
2003-01-23 21:16                                                               ` Matthew Jacob
2003-01-24  0:07                                                           ` Oliver Neukum
2003-01-24  0:21                                                             ` Matthew Jacob
2003-01-24  7:53                                                               ` David Brownell
2003-01-24 15:26                                                                 ` Matthew Jacob
2003-01-24  0:54                                                             ` Steven Dake
2003-01-24  2:35                                                               ` [linux-usb-devel] " Matthew Dharm
2003-01-22 21:30                                           ` [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 David Brownell
2003-01-20 22:16                           ` Luben Tuikov
2003-01-20 22:51                             ` David Brownell
2003-01-20 23:27                               ` Oliver Neukum
2003-01-24 16:36 A different look at block device hotswap in the Linux kernel Cress, Andrew R
2003-01-24 18:01 ` Bryan Henderson
2003-01-24 18:09   ` Matthew Jacob

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox